WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C07H 21/04, C07K 14/705, C12N 15/09,
15/63, C12Q 1/68 



A1



(11) International Publication Number: WO 99/64436 

(43) International Publication Date: 16 December 1999 (16.12.99) 



(21) International Application Number: PCT/US99/12773 

(22) International Filing Date: 8 June 1999 (08.06.99) 



(30) Priority Data:
60/089,098    12 June 1998 (12.06.98)    US



(71) Applicant (for all designated States except US): MERCK &
CO., INC. [US/US]; 126 East Lincoln Avenue, Rahway, NJ
07065 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): FEIGHNER, Scott, D.
[US/US]; 126 East Lincoln Avenue, Rahway, NJ 07065
(US). PATCHETT, Arthur, A. [US/US]; 126 East Lincoln
Avenue, Rahway, NJ 07065 (US). TAN, Carina [MY/US];
126 East Lincoln Avenue, Rahway, NJ 07065 (US).
MCKEE, Karen [US/US]; 126 East Lincoln Avenue, Rahway,
NJ 07065 (US). MACNEIL, Douglas [US/US]; 126 East
Lincoln Avenue, Rahway, NJ 07065 (US). HOWARD,
Andrew, D. [US/US]; 126 East Lincoln Avenue, Rahway, NJ
07065 (US). PONG, Sheng-Shung [US/US]; 126 East
Lincoln Avenue, Rahway, NJ 07065 (US). SMITH, Roy, G.
[GB/US]; 126 East Lincoln Avenue, Rahway, NJ 07065
(US).



(74) Common Representative: MERCK & CO., INC.; 126 East 
Lincoln Avenue, Rahway, NJ 07065 (US). 



(81) Designated States: CA, JP, US, European patent (AT, BE, CH, 
CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, 
PT, SE). 



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: CLONING AND IDENTIFICATION OF THE MOTILIN RECEPTOR 
(57) Abstract 

The motilin receptor has been isolated and cloned, and nucleic acid sequences are given. Two splice variants have been identified. 
Also, assays for motilin receptor ligands are given. The identification of the cloned motilin receptor may be used to screen and identify 
compounds which bind to the receptor for use in a variety of gastric conditions, including gastric motility disorders. 






FOR THE PURPOSES OF INFORMATION ONLY 
Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe










WO 99/64436 



PCT/US99/12773



TITLE OF THE INVENTION 
CLONING AND IDENTIFICATION OF THE MOTILIN RECEPTOR 

CROSS-REFERENCE TO RELATED APPLICATIONS 
xxxxx

STATEMENT REGARDING FEDERALLY-SPONSORED R&D
xxxxx

REFERENCE TO MICROFICHE APPENDIX
xxxxx

FIELD OF THE INVENTION

The present invention is directed to a novel human DNA
sequence encoding a motilin receptor, the receptor encoded by the
DNA, and the uses thereof.

BACKGROUND OF THE INVENTION 

Gastrointestinal (GI) motility is a coordinated neuromuscular
process which transports nutrients through the digestive system.
Impaired GI motility can lead to irritable bowel syndrome, constipation,
and diabetic and post-surgical gastroparesis, and is one of the largest
health care burdens of industrialized nations. Motilin, a 22 amino acid
prokinetic peptide, is expressed throughout the gastrointestinal tract in a
number of species including humans. Released from enterochromaffin
cells of the small intestine, motilin exerts a profound effect on gastric
motility with the induction of interdigestive (phase III) antrum and
duodenal contractions. The unrelated macrolide antibiotic
erythromycin also possesses prokinetic properties mediated by its
interaction with motilin receptors. These account for erythromycin's
GI side-effects, including vomiting, nausea, diarrhea and abdominal
muscular discomfort.

Motilin receptors have been detected in the GI tract and recently
in the central nervous system, but their molecular structure has not been
reported. Although motilin receptor characterization has been actively
pursued in humans and other species since the isolation of motilin from
porcine intestine in 1972, the receptor itself has been neither isolated
nor cloned.

Motilin is highly conserved across species and is synthesized as
part of a larger pre-prohormone. Mature 22 amino acid motilin is
generated by removal of its secretory signal peptide and cleavage at the
first C-terminally located dibasic prohormone convertase recognition
site. Using radioligand binding, autoradiography and in vitro bioassays,
high-affinity, low-density motilin receptors were detected in smooth
muscle cells of the gastrointestinal tract of humans, cats and rabbits.
Cerebellar brain receptors for motilin were also described, supporting
the notion that motilin may act in the central nervous system. Native
motilin receptors appear to be coupled to G proteins and activate the
phospholipase C signal transduction pathway, resulting in Ca2+ influx
through L-type channels.

The development of safe and selective motilin receptor agonists is
likely to aid the treatment of disorders resulting from impaired GI
motility. Thus, it would be desirable to be able to isolate and clone the
motilin receptor, and to use this in assays for agonists and antagonists.

SUMMARY OF THE INVENTION

The present invention is directed to a novel G-protein
coupled receptor (GPCR), designated as motilin receptor. Two spliced
forms of the motilin receptor were identified: MTL-R1A, which
encodes a functional seven-transmembrane domain form, and
MTL-R1B, which encodes a truncated five-transmembrane domain form.
Both forms make up embodiments of this invention.

Another aspect of this invention is nucleic acids which
encode the motilin receptor and which are isolated, or free from
associated nucleic acids.

Other aspects of this invention include assays for
identifying motilin ligands which are agonists and antagonists of a
motilin receptor, comprising contacting a candidate ligand with a motilin
receptor and determining if binding occurred.

Another aspect of this invention is a method for
determining whether a ligand is capable of binding to a motilin receptor
comprising:

(a) transfecting test cells with an expression vector
encoding motilin receptor;

(b) exposing the test cells to the ligand;

(c) measuring the amount of binding of the ligand
to the motilin receptor;

(d) comparing the amount of binding of the ligand
to the motilin receptor in the test cells with the amount of binding of the
ligand to control cells that have not been transfected with the motilin
receptor;

where if the amount of binding of the ligand to the test cells
is greater than the amount of binding of the ligand to the control cells,
then the ligand is capable of binding to motilin receptor.
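
The comparison in step (d) can be sketched as follows. This is only an illustration: the function name, the use of replicate means, and the example readings are assumptions and are not taken from this application.

```python
from statistics import mean

def ligand_binds(test_readings, control_readings):
    """Return True when mean binding to transfected test cells exceeds
    mean binding to non-transfected control cells (step (d))."""
    return mean(test_readings) > mean(control_readings)

# Hypothetical replicate readings in arbitrary units; not data from the application.
transfected = [5200.0, 4900.0, 5100.0]
control = [410.0, 380.0, 450.0]
print(ligand_binds(transfected, control))
```

In practice a statistical threshold would usually replace the bare inequality; the claim language itself only requires that binding to test cells be greater.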

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows the DNA sequence of the motilin receptor gene
including the 5' untranslated region (SEQ.ID.NO.:1). Intronic sequences
are shown in lower case type.

Figure 2 shows the DNA sequence of motilin receptor
spliced form A (MTL-R1A) (SEQ.ID.NO.:2).

Figure 3 shows the deduced amino acid sequence of MTL-R1A
(SEQ.ID.NO.:3).

Figure 4 shows the DNA sequence of motilin receptor
spliced form B (MTL-R1B) (SEQ.ID.NO.:4).

Figure 5 shows the deduced amino acid sequence of
MTL-R1B (SEQ.ID.NO.:5).

Figures 6A-C compare the DNA and protein sequences for
MTL-R1A and MTL-R1B.

Figure 7 shows the DNA sequence of puffer fish clone
75E7 (SEQ.ID.NO.:6).

Figure 8 shows the deduced amino acid sequence of puffer
fish clone 75E7 (SEQ.ID.NO.:7).

Figure 9 shows the comparison of human MTL-R1A and
puffer fish clone 75E7 protein sequences.

Figure 10 is a graph illustrating the pharmacological
characterization of the cloned MTL-R1A in the aequorin
bioluminescence assay in HEK-293 cells.

Figure 11 is a graph illustrating the pharmacological
characterization of the cloned MTL-R1A in the [125I]-Tyr7-human
motilin binding assay.

As used throughout the specification and claims, the
following definitions apply:

"Substantially free from other proteins" means at least
90%, preferably 95%, more preferably 99%, and even more preferably
99.9%, free of other proteins. Thus, for example, a MTL-R1 protein
preparation that is substantially free from other proteins will contain, as
a percent of its total protein, no more than 10%, preferably no more
than 5%, more preferably no more than 1%, and even more preferably
no more than 0.1%, of non-MTL-R1 proteins. Whether a given
MTL-R1 protein preparation is substantially free from other proteins can be
determined by such conventional techniques of assessing protein purity
as, e.g., sodium dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE) combined with appropriate detection methods, e.g., silver
staining or immunoblotting.

"Substantially free from other nucleic acids" means at least
90%, preferably 95%, more preferably 99%, and even more preferably
99.9%, free of other nucleic acids. Thus, for example, a MTL-R1 DNA
preparation that is substantially free from other nucleic acids will
contain, as a percent of its total nucleic acid, no more than 10%,
preferably no more than 5%, more preferably no more than 1%, and
even more preferably no more than 0.1%, of non-MTL-R1 nucleic
acids. Whether a given MTL-R1 DNA preparation is substantially free
from other nucleic acids can be determined by such conventional
techniques of assessing nucleic acid purity as, e.g., agarose gel
electrophoresis combined with appropriate staining methods, e.g.,
ethidium bromide staining, or by sequencing.

"Functional equivalent" means a receptor which does not
have the exact same amino acid sequence as a naturally occurring
motilin receptor, due to alternative splicing, deletions, mutations, or
additions, but retains at least 1%, preferably 10%, and more preferably
25% of the biological activity of the naturally occurring receptor. Such
derivatives will have a significant homology with a motilin receptor and
can be detected by reduced stringency hybridization with a DNA
sequence obtained from a motilin receptor. The nucleic acid encoding a
functional equivalent has at least about 50% homology at the nucleotide
level to a naturally occurring receptor nucleic acid.

"Ligand" means any molecule which binds to a motilin
receptor of this invention. These ligands can have either agonist, partial
agonist, partial antagonist or antagonist activity.
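
The tiered thresholds in the "substantially free" definitions above can be written as a small lookup. This is a sketch only; the function name and the tier labels are hypothetical, while the cut-offs (10%, 5%, 1%, 0.1% contaminant) are those stated in the definitions.

```python
def purity_tier(contaminant_percent: float) -> str:
    """Classify a preparation by the percent of its total protein (or nucleic
    acid) that is not MTL-R1, using the thresholds from the definitions."""
    if not 0.0 <= contaminant_percent <= 100.0:
        raise ValueError("contaminant_percent must be between 0 and 100")
    if contaminant_percent <= 0.1:
        return "even more preferred (at least 99.9% free)"
    if contaminant_percent <= 1.0:
        return "more preferred (at least 99% free)"
    if contaminant_percent <= 5.0:
        return "preferred (at least 95% free)"
    if contaminant_percent <= 10.0:
        return "substantially free (at least 90% free)"
    return "not substantially free"

print(purity_tier(3.0))
```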

DETAILED DESCRIPTION OF THE INVENTION

The cloning of GPCRs related to the hypothalamic and
pituitary receptor for the growth hormone (GH) secretagogues (GHSs),
which mediate sustained pulsatile GH release, has been recently
described (McKee et al., 1997 Genomics 46:426-434, which is hereby
incorporated by reference). One of these clones, GPR38, possessed the
most significant amino acid sequence identity to the human GHSR (52%,
rising to as high as 86% in the transmembrane (TM) domains). GPR38 was
classified as an orphan GPCR (a GPCR for which a natural ligand has
not been identified).

GPR38 was isolated from a human genomic DNA library
and contained a single intron of approximately 1 kb, as shown in
FIGURE 1. cDNA clones were isolated to obtain the nucleotide
sequence of correctly spliced GPR38 mRNA. Efforts to isolate cDNA
clones by standard library screening proved unsuccessful.

A combination of RACE and RT-PCR techniques resulted
in the identification of two spliced forms for GPR38. These two
GPR38 cDNAs use distinct splice donor sites and a common acceptor
site (perfect match to consensus exon-intron splice acceptor junction
sequence [pyrimidine-rich stretch ag/TG]). GPR38-A mRNA (imperfect
match to consensus donor sequence [TGC/gt]) encodes a polypeptide of
412 amino acids with seven alpha-helical TM domains, the hallmark
feature of GPCRs, whereas GPR38-B encodes a 363 amino acid
polypeptide with five TM domains (perfect donor sequence [CCG/gt]).
Northern blot analysis failed to reveal an expression profile for GPR38.
However, when RNase protection was employed, expression was
demonstrated in stomach, thyroid and bone marrow.
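
The junction notation used above (for example [CCG/gt] for a donor and [ag/TG] for an acceptor) can be read directly from a sequence written with the Figure 1 convention of upper-case exons and lower-case introns. The sketch below is illustrative only: the function name is hypothetical and the toy sequence is invented, not taken from SEQ.ID.NO.:1.

```python
import re

def intron_junctions(seq: str, flank: int = 3):
    """Return (start, end, donor, acceptor) for each lower-case (intronic) run.

    donor    = last `flank` exon bases + "/" + first two intron bases
    acceptor = last two intron bases + "/" + first two bases of the next exon
    """
    junctions = []
    for match in re.finditer(r"[a-z]+", seq):
        start, end = match.start(), match.end()
        donor = seq[max(0, start - flank):start] + "/" + seq[start:start + 2]
        acceptor = seq[end - 2:end] + "/" + seq[end:end + 2]
        junctions.append((start, end, donor, acceptor))
    return junctions

# Invented toy sequence: exon, intron with a pyrimidine-rich 3' end, exon.
toy = "ATGCCG" + "gtaagtccttctctttcag" + "TGCTAA"
for start, end, donor, acceptor in intron_junctions(toy):
    print(start, end, donor, acceptor)
```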

In accordance with this invention, it has been found that
GPR38 is the motilin receptor. Thus, this invention is directed to the
human motilin receptor, its functional equivalents, motilin receptors
from other species which can be isolated using fragments of the human
motilin DNA as probes, and to splice variants of the motilin receptor.

The intact motilin receptor of this invention was found to
have structural features which are typical of G-protein linked receptors,
including seven transmembrane (TM) domains, three intra- and
extracellular loops, and the GPCR protein signature sequence. The TM
domains and GPCR protein signature sequence are noted in the protein
sequences of the GPCR in Figures 6A-C.

A high-throughput assay was developed which measures
Ca2+ release with the bioluminescent Ca2+-sensitive aequorin reporter
protein (capable of measuring ligand-induced IP3-coupled mobilization
of intracellular calcium and concomitant calcium-induced aequorin
bioluminescence). Expression of cloned GPR38-A in cell membranes
was confirmed using epitope-tagged protein, which revealed a single
protein species with a molecular weight of approximately 45,000
daltons; the open reading frame encodes 412 amino acids
(SEQ.ID.NO.:3). The DNA and deduced amino acid sequences are
given in SEQ.ID.NO.:2 and SEQ.ID.NO.:3, respectively.

A broad set of peptide and non-peptide molecules was
tested at a single concentration in transiently transfected HEK-293/aeq17
cells (100 nM peptide, 10 µM non-peptide). Significant bioluminescent
responses were recorded for the peptide motilin and the non-peptide
macrolide erythromycin, which was reported to be a competitive
agonist at motilin receptors. Full dose-response curves confirmed this
observation.

Nucleotide sequence analysis revealed two splice forms of
human motilin receptor, both of which make up further aspects of this
invention. The first (MTL-R1A) encodes a seven transmembrane
domain receptor. The full length open reading frame appears to contain
412 amino acids. The second splice form (MTL-R1B) diverges in its
nucleotide sequence from MTL-R1A just before the predicted amino
acid of the sixth transmembrane domain (TM6).

In MTL-R1B, TM5 is truncated and fused to a
contiguous reading frame of about 86 amino acids, followed by a
translation stop codon. The DNA and amino acid sequences of
MTL-R1A and MTL-R1B are given in FIGURES 2-5.
A further aspect of this invention is a related motilin
receptor gene, evident in the teleost puffer fish Spheroides nephelus.
Screening of a puffer fish genomic library identified a single clone
(75E7) containing an open reading frame of 363 amino acids
(approximately 54% identical at the protein level) which contains a
similar exon-intron structure to GPR38. Analysis of clone 75E7 shows
that the deduced amino acid sequence contains 363 amino acids, with a
molecular weight of approximately 41,323 daltons (FIGURE 8). The DNA
sequence of puffer fish clone 75E7 is given in SEQ.ID.NO.:6, and a
comparison of human MTL-R1A and puffer fish clone 75E7 protein
sequences is given in FIGURE 9.
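
A percent-identity figure such as the approximately 54% quoted above is computed from a pairwise alignment. The sketch below shows one common convention (identical residues divided by aligned, non-gap positions); the application does not state which convention was used, so the denominator is an assumption, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Invented toy alignment; not the Figure 9 alignment.
print(percent_identity("MGSPW-LLA", "MGTPWNLIA"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values when gaps are present.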

Another aspect of this invention relates to vectors which
comprise nucleic acids encoding a motilin receptor or a functional
equivalent. These vectors may be comprised of DNA or RNA; for most
cloning purposes DNA vectors are preferred. Typical vectors include
plasmids, modified viruses, bacteriophage and cosmids, yeast artificial
chromosomes and other forms of episomal or integrated DNA that
encode a motilin receptor. It is well within the skill of the ordinary
artisan to determine an appropriate vector for a particular gene transfer
or other use.

A further aspect of this invention is host cells which are
transformed with a gene which encodes a motilin receptor or a
functional equivalent. The host cell may or may not naturally express a
motilin receptor on the cell membrane. Preferably, once transformed,
the host cells are able to express the motilin receptor or a functional
equivalent on the cell membrane. Depending on the host cell, it may be
desirable to adapt the DNA so that particular codons are used in order
to optimize expression. Such adaptations are known in the art, and these
nucleic acids are also included within the scope of this invention.
Generally mammalian cell lines, such as HEK-293, COS, CHO, HeLa,
NS/0, CV-1, GC, GH3 or VERO cells, are preferred host cells, but other
cells and cell lines, such as Xenopus oocytes or insect cells, may also be
used.

Human embryonic kidney (HEK 293) cells and Chinese
hamster ovary (CHO) cells are particularly suitable for expression of
motilin receptor proteins because these cells express a large number of
G-proteins. Thus, it is likely that at least one of these G-proteins will be
able to functionally couple the signal generated by interaction of motilin
receptors and their ligands, thus transmitting this signal to downstream
effectors, eventually resulting in a measurable change in some assayable
component, e.g., cAMP level, expression of a reporter gene, hydrolysis
of inositol lipids, or intracellular Ca2+ levels.

A variety of mammalian expression vectors can be used to
express recombinant motilin receptor in mammalian cells. Commercially
available mammalian expression vectors which are suitable include, but
are not limited to, pCR2.2 (Invitrogen), pMC1neo (Stratagene), pSG5
(Stratagene), pcDNAI and pcDNAIamp, pcDNA3, pcDNA3.1, pCR3.1
(Invitrogen), EBO-pSV2-neo (ATCC 37593), pBPV-1 (8-2) (ATCC
37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC
37199), pRSVneo (ATCC 37198), and pSV2-dhfr (ATCC 37146).
Following expression in recombinant cells, motilin receptors can be
purified by conventional techniques to a level that is substantially free
from other proteins.

The specificity of binding of compounds showing
affinity for motilin receptors is shown by measuring the affinity of
the compounds for recombinant cells expressing the cloned receptor
or for membranes from these cells. Expression of the cloned
receptor and screening for compounds that bind to motilin receptors
or that inhibit the binding of a known, radiolabeled ligand of motilin
receptors to these cells, or membranes prepared from these cells,
provides an effective method for the rapid selection of compounds
with high affinity for a motilin receptor. Such ligands need not
necessarily be radiolabeled but can also be nonisotopic compounds
that can be used to displace bound radiolabeled compounds or that
can be used as activators in functional assays. Compounds identified
by the above method are likely to be agonists or antagonists of
motilin receptors and may be peptides, proteins, or
non-proteinaceous organic molecules.

Such molecules are useful in treating a variety of gastric
conditions, including gastric motility disorders (intrinsic myopathies
or neuropathy), functional defects, disorders which are secondary to
neurologic disorders including spinal cord transections, amyloidosis,
collagen vascular disease (e.g. scleroderma), paraneoplastic
syndromes, radiation-induced dysmotility, diabetes, infections,
stress-related motility disorders, psychogenic/functional disorders,
other drugs which affect motility (e.g. beta adrenergic drugs which
may delay gastric emptying, cholinergic agents or opiates, or
serotonin receptor antagonists), gastroparesis (diabetic or
post-surgical), gastroesophageal reflux disease, constipation, chronic
idiopathic pseudo-obstruction and acute fecal impaction,
postoperative ileus, gallstones, infantile colic, preparation for
colonoscopy and endoscopy, duodenal intubation, irritable bowel
syndrome, non-ulcer dyspepsia, non-cardiac chest pain and
diarrhea.

The pharmacological characterization of the cloned
MTL-R1A in the aequorin bioluminescence assay in HEK-293 cells is shown
in Figure 10 and in the [125I]-Tyr7-human motilin binding assay
in Figure 11. Motilin at concentrations as high as 10 µM gave no
bioluminescent response above background levels in cells that were not
transfected with the MTL-R1A cDNA expression vector. Similarly,
non-transfected cells did not show appreciable binding of
[125I]-Tyr7-human motilin.

The rank order of potency for motilin, motilin peptide
fragments and non-peptide molecules is consistent with experiments
performed on native motilin receptors from stomach or intestinal
tissues.

Due to the high degree of homology to GPCRs, the motilin
receptor of this invention is believed to function similarly to GPCRs and
have similar biological activity. The receptors are useful in understanding
the biological and physiological pathways involved in gastrointestinal
motility. They may also be used to screen for motilin agonists and
antagonists, and in particular to test the specificity of identified ligands.

The following non-limiting Examples are presented to
better illustrate the invention.

EXAMPLE 1

Sequence Comparison of MTL-R1 (GPR38) to human GHS-R, Puffer
Fish 75E7 and Identification of Alternatively Spliced Forms.

Inspection of the MTL-1 genomic DNA sequence revealed
potential mRNA splice sites corresponding to consensus boundaries for
exon/intron junctions. An imperfect donor site (TGC/gt) was found at
nucleotides 1929-31 (Fig. 1), a perfect donor site (CCG/gt) was found at
nucleotides 2080-82, and a single perfect splice acceptor site (sequence
[pyrimidine-rich stretch ag/TG]) was observed at nucleotides 2729-32.
To determine which splice forms exist naturally, RACE (rapid
amplification of cDNA ends) was performed on thyroid poly (A)+
mRNA and RT-PCR (reverse transcriptase polymerase chain reaction)
was conducted on HEK-293/aeq17 cells transfected with the MTL-1
genomic DNA construct. Directional RACE reactions were conducted
on thyroid poly (A)+ mRNA that had previously been shown by RNase
protection assay to contain transcripts for MTL-1R. Primer AP1
5'-CCA TCC TAA TAC GAC TCA CTA TAG GGC-3' (SEQ.ID.NO.:8)
corresponds to the 5' end of the coding region including the
presumptive Met initiation codon located within the cloning vector.
5' RACE1 corresponds to the 3' end of the MTL-1R coding region
including the translation termination codon TAA. 5' RACE1: 5'-TTA
TCC CAT CGT CTT CAC GTT AGC GCT TGT CTC-3'
(SEQ.ID.NO.:9).

RACE reactions were carried out on 1 ng of thyroid poly (A)+
mRNA using the Marathon cDNA Amplification/Advantage PCR kit as
per the manufacturer's instructions (Clontech) using the following
Touchdown PCR amplification conditions: 94°C for 1 min.; 5 cycles of
94°C for 30 sec. and 72°C for 4 min.; 5 cycles of 94°C for 30 sec. and
70°C for 4 min.; 25 cycles of 94°C for 20 sec. and 68°C for 4 min. An
approximately 1.4 kb amplified product was identified which hybridized
with a 32P-labeled probe derived from the TM 2-4 region (3F/4R
probe) of the MTL-R. This product was subcloned into PCR-Script
vector (Invitrogen) and sequenced.
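
The Touchdown PCR conditions above can be written down as a simple program table, which makes it easy to check the cycle count and the minimum run time. The data structure and function names below are hypothetical; the temperatures, hold times and cycle numbers are those given in the text, and ramp times between temperatures are ignored.

```python
# Each block: (number of cycles, [(temperature in °C, hold time in seconds), ...])
TOUCHDOWN_PROGRAM = [
    (1, [(94, 60)]),               # initial denaturation, 94°C for 1 min
    (5, [(94, 30), (72, 240)]),    # 5 cycles, anneal/extend at 72°C
    (5, [(94, 30), (70, 240)]),    # 5 cycles, anneal/extend at 70°C
    (25, [(94, 20), (68, 240)]),   # 25 cycles, anneal/extend at 68°C
]

def total_hold_seconds(program):
    """Sum of all programmed hold times, excluding ramping between steps."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for cycles, steps in program)

def amplification_cycles(program):
    """Number of cycles in blocks that have more than one step."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)

print(amplification_cycles(TOUCHDOWN_PROGRAM), "cycles")
print(round(total_hold_seconds(TOUCHDOWN_PROGRAM) / 60, 1), "minutes of hold time")
```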

As diagrammed in Figures 6A-C, DNA sequence analysis
revealed two distinct sequences corresponding to alternative use of two
splice donor sequences and a common splice acceptor sequence. These
results were confirmed by transfecting the MTL-1 genomic construct
containing the complete ORF interrupted by a single intron of
approximately 0.7 kb into HEK-293/aeq17 cells. mRNA was then
isolated (Poly (A)+ Pure Kit, Ambion) and shown by Northern blot
analysis using the 3F/4R probe to give two hybridizing bands: 2.4 kb
containing the unspliced intron and approximately 1.4 kb encoding
spliced forms. RT-PCR was then performed (Superscript 2 One-Step
Kit, Life Technologies) on MTL-1 mRNA from transfected
HEK-293/aeq17 cells using the forward primer 5' RACE1 and reverse primer
3' RACE2 (TM5 region): 5'-CTG CCC TTT CTG TGC CTC AGC
ATC CTC TAC-3' (SEQ.ID.NO.:10).

An approximately 500 bp product was cloned (TA vector
pCR2.2, Invitrogen), sequenced and shown to be a mixture of both
splice forms. Assembly of the complete ORF for MTL-1A without
intronic sequence was performed by ligation of an exon 1 fragment
(Not I digestion of a MTL-1 plasmid containing the intron in
pCDNA-3) to pCDNA-3.1 containing a Not I/EcoRI exon 2 fragment.

To document protein expression, an MTL-1A plasmid encoding an
amino-terminal FLAG epitope was constructed by ligation of a Pme I
fragment from the MTL-1A/pcDNA-3.1 vector into the EcoRV site of
pFLAG/CMV-2 vector (Kodak Imaging Systems). Following
transfection of this plasmid into HEK-293/aeq17 cells, a protein of the
expected size (approximately 48 kDa) was detected in crude cell
membranes by immunoblot analysis.

EXAMPLE 2 
Identification of Ligand Specific to Motilin Receptor 

To identify a ligand for this orphan GPCR and to determine
whether the full length, 7 TM domain GPR38-A is a functional GPCR, a
high-throughput assay was developed which measures Ca2+ release with
the bioluminescent Ca2+-sensitive aequorin reporter protein (capable of
measuring ligand-induced IP3-coupled mobilization of intracellular
calcium and concomitant calcium-induced aequorin bioluminescence).
Expression of GPR38-A in cell membranes was confirmed using
epitope-tagged protein, which revealed a single protein species with a
molecular weight of approximately 45,000 daltons.

A broad set of peptide and non-peptide molecules was tested at a
single concentration in transiently transfected HEK-293/aeq17 cells (100
nM peptide, 10 µM non-peptide). Significant bioluminescent responses
(> 4-fold over background) were recorded for the peptide motilin and
the non-peptide macrolide erythromycin, which was reported to be a
competitive agonist at motilin receptors. Full dose-response curves
confirmed this observation. The half-maximal effective concentration
(EC50) for human/porcine motilin was 2.1 +/- 0.5 nM, whereas
erythromycin was considerably less potent (2000 +/- 210 nM; as
expected from studies performed on native motilin receptors).
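
To see what the reported EC50 values imply at the screening concentrations, a simple Hill-type dose-response model can be used. This is a sketch under stated assumptions: a Hill slope of 1 and a response normalized to the maximum are assumed here, since the application reports only the EC50 values; the function name is hypothetical.

```python
def fractional_activation(conc_nM: float, ec50_nM: float, hill: float = 1.0) -> float:
    """Fraction of maximal response predicted by a Hill equation.

    Assumes a slope of 1 unless stated otherwise; this is an assumption,
    not a value reported in the application.
    """
    if conc_nM < 0 or ec50_nM <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    return conc_nM ** hill / (ec50_nM ** hill + conc_nM ** hill)

# EC50 values from the text: motilin 2.1 nM, erythromycin 2000 nM.
motilin_at_screen = fractional_activation(100.0, 2.1)           # 100 nM peptide screen
erythromycin_at_screen = fractional_activation(10_000.0, 2000)  # 10 µM non-peptide screen
print(round(motilin_at_screen, 3), round(erythromycin_at_screen, 3))
```

Under these assumptions both compounds would be expected to give most of their maximal response at the single screening concentration, which is in line with both being detected in the initial screen.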

The signal transduction pathway for the cloned GPR38-A motilin
receptor (MTL-R1A) is through activation of phospholipase C, which
has been reported for native motilin receptors. Direct radioligand
binding studies with [125I] human motilin on cell membranes prepared
from transfected cells show that MTL-R1A confers high affinity and
specific binding (Kd = 0.1 nM; Bmax = 240 fmol/mg protein) which is
strongly G protein coupled (> 80% inhibition of binding with 100 nM
GTPγS).
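
The Kd and Bmax values above are the two parameters of a one-site saturation binding model, B = Bmax x [L] / (Kd + [L]). The sketch below evaluates that model with the reported values; the one-site form is a standard assumption, not something the application states explicitly, and the function name is hypothetical.

```python
def specific_bound(ligand_nM: float, kd_nM: float = 0.1, bmax: float = 240.0) -> float:
    """Specifically bound ligand (fmol/mg protein) for a one-site binding model.

    Defaults are the Kd (0.1 nM) and Bmax (240 fmol/mg protein) from the text.
    """
    if ligand_nM < 0:
        raise ValueError("ligand concentration must be non-negative")
    return bmax * ligand_nM / (kd_nM + ligand_nM)

for conc in (0.01, 0.1, 1.0):
    print(conc, "nM ->", round(specific_bound(conc), 1), "fmol/mg")
```

At a free ligand concentration equal to Kd the model predicts half of Bmax, which is a quick sanity check on any fitted values.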

EXAMPLE 3
Functional Activation of the MTL-1A Receptor

The aequorin bioluminescence assay is a reliable test for
identifying G protein-coupled receptors which couple through the Gα
protein subunit family consisting of Gq and G11, which leads to the
activation of phospholipase C, mobilization of intracellular calcium and
activation of protein kinase C. Measurement of MTL-1A expression in
the aequorin-expressing stable reporter cell line 293-AEQ17 (Button,
D. et al., 1993 Cell Calcium 14:663-671) was performed using a
Luminoskan RT luminometer (Labsystems Inc., Gaithersburg, MD).

293-AEQ17 cells (8 x 10^5 cells plated 18 hrs. before transfection
in a T75 flask) were transfected with 22 µg of human MTL-R1A
plasmid DNA: 264 µg lipofectamine. Following approximately 40
hours of expression the apo-aequorin in the cells was charged for 4
hours with coelenterazine (10 µM) under reducing conditions (300 µM
reduced glutathione) in ECB buffer (140 mM NaCl, 20 mM KCl, 20
mM HEPES-NaOH [pH=7.4], 5 mM glucose, 1 mM MgCl2, 1 mM
CaCl2, 0.1 mg/ml bovine serum albumin). The cells were harvested,
washed once in ECB medium and resuspended to 500,000 cells/ml. 100
µl of cell suspension (corresponding to 5 x 10^4 cells) was then injected
into the test plate, and the integrated light emission was recorded over
30 seconds, in 0.5 second units. 20 µl of lysis buffer (0.1% final
Triton X-100 concentration) was then injected and the integrated light
emission recorded over 10 seconds, in 0.5 second units. The "fractional
response" values for each well were calculated by taking the ratio of the
integrated response to the initial challenge to the total integrated
luminescence including the Triton X-100 lysis response.
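
The "fractional response" calculation described above can be sketched as follows. The function and variable names are hypothetical, and the light readings in the example are invented; only the ratio itself (challenge response divided by challenge plus lysis response) and the cell-number arithmetic come from the text.

```python
def fractional_response(challenge_readings, lysis_readings):
    """Ratio of integrated light after ligand challenge to total integrated light.

    Each argument is a sequence of light readings taken in 0.5 second units;
    integration is done here by simple summation.
    """
    challenge = sum(challenge_readings)
    total = challenge + sum(lysis_readings)
    if total <= 0:
        raise ValueError("total integrated luminescence must be positive")
    return challenge / total

def cells_per_well(cells_per_ml: float, volume_ul: float) -> float:
    """Number of cells delivered in a given injection volume."""
    return cells_per_ml * volume_ul / 1000.0

# Invented readings in arbitrary light units, 60 bins for 30 s and 20 bins for 10 s.
challenge = [5.0] * 60
lysis = [35.0] * 20
print(fractional_response(challenge, lysis))
print(cells_per_well(500_000, 100))
```

Normalizing to the total releasable signal corrects for well-to-well differences in cell number and aequorin loading.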

20 

EXAMPLE 4 

Binding of [1251] Human Motilin to Crude Membranes from HEK-293 
Cells transfected with the MTL-R1 A cDNA. 

The binding of [125I] human motilin to crude membranes 
prepared from HEK-293/aeq17 cell transfectants was performed as 
follows. Crude cell membranes were prepared on ice, 48 hrs. post- 
transfection. Each T-75 flask was washed twice with 10 ml of PBS, 
once with 1 ml homogenization buffer (50 mM Tris-HCl [pH 7.4], 10 
mM MgCl2). 10 ml of homogenization buffer was added to each flask, 
cells were removed by scraping and then homogenized using a Polytron 
device (Brinkmann, Syosset, NY; 3 bursts of 10 sec. at setting 4). The 
homogenate was centrifuged for 20 min. at 11,000 x g at 0°C and the 
resulting crude membrane pellet (chiefly containing cell membranes and 
nuclei) was resuspended in homogenization buffer supplemented with 
1.5% BSA (0.5 ml/T75 flask) and kept on ice. 
Binding reactions were performed at 20°C for 1 hr. in a total 
volume of 0.5 ml containing: 0.1 ml of membrane suspension 
(approximately 1 µg protein), 10 µl of [125I]-human motilin, 10 µl of 
competing drug and 380-390 µl of homogenization buffer. Bound 
radioligand was separated by rapid vacuum filtration (Brandel 48-well 
cell harvester) through GF/C filters pretreated for 1 hr. with 0.5% 
polyethylenimine. After application of the membrane suspension to the 
filter, the filters were washed 3 times with 3 ml each of ice-cold 50 mM 
Tris-HCl [pH 7.4], 10 mM MgCl2, and the bound radioactivity on the 
filters was quantitated by gamma counting. Specific binding (> 90% of 
total) is defined as the difference between total binding and non-specific 
binding conducted in the presence of 100 nM unlabeled human motilin. 
Competition binding data were analyzed by a nonlinear curve-fitting 
program (Prism, version 2.0; GraphPad Software, San Diego, CA). 
Results shown are the means (+/- SEM) of triplicate determinations. 
Human motilin was radiolabeled with 125I at Tyr7 to a specific activity 
of approximately 2000 Ci/mmol (Woods Assay, Portland, OR). 
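
The quantities named in this paragraph (specific binding, mean and SEM of triplicates, and a competition curve) can be sketched as below. This is an illustrative sketch under stated assumptions: the one-site competition model is a common textbook form and is not claimed to be the exact model or algorithm used by the curve-fitting program, and all function names and numbers are hypothetical.

```python
import math
import statistics


def specific_binding(total, nonspecific):
    """Specific binding = total binding minus non-specific binding."""
    return total - nonspecific


def mean_and_sem(replicates):
    """Mean and standard error of the mean for replicate determinations."""
    mean = statistics.mean(replicates)
    sem = statistics.stdev(replicates) / math.sqrt(len(replicates))
    return mean, sem


def one_site_competition(conc_nM, ic50_nM, top, bottom):
    """Bound radioligand predicted by a one-site competition model (assumed).

    top is binding without competitor; bottom is the non-specific plateau.
    """
    return bottom + (top - bottom) / (1.0 + conc_nM / ic50_nM)


# Hypothetical counts, for illustration only.
print(specific_binding(10000.0, 800.0))
print(mean_and_sem([9.0, 10.0, 11.0]))
print(one_site_competition(0.5, 0.5, 10000.0, 800.0))
```

In this model the competitor concentration equal to the IC50 gives binding halfway between the top and bottom plateaus, which is what a fitting routine would adjust the IC50 parameter to match.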

Structure-function analyses suggest that the motilin peptide 
minimally contains an N-terminal region (amino acids 1-7) essential for 
activity, linked to a C-terminal alpha-helical domain which stabilizes the 
activity of the N-terminal active-site region. The rank order of potency of 
several motilin peptide analogs in the MTL-1A functional and binding 
assays correlates with their reported potency measured by in vitro 
contractility assays performed on native motilin receptors in 
intestinal tissue. These results are summarized in Table 1 below. 



TABLE 1 

                      Cloned MTL-1A Receptor (human) 

Ligand                Aequorin Assay    [125I] hmotilin binding 
                      (EC50, nM)        (IC50, nM) 

human motilin (MTL)   2.1               0.5 
erythromycin          2000              427 
roxithromycin         1950              613 
metoclopramide        >10,000           >10,000 
cisapride             >10,000           >10,000 
canine motilin        4.4               0.2 
Leu13 MTL             3.9               0.2 
1-11 MTL              138               127 
1-12 MTL              72                14 
1-13 MTL              3.8               0.9 
1-19 MTL              4.1               0.3 
10-22 MTL             >10,000           1100 



The unrelated prokinetic agents metoclopramide and cisapride, 
which have affinity for dopamine and/or 5-HT receptors, were inactive, 
even at high (10 µM) doses. 
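
As an illustration of how a rank order of potency can be read from Table 1, the sketch below sorts the ligands by their binding IC50 and expresses each as a fold shift relative to human motilin. The IC50 values are those of Table 1; the function names are hypothetical, and ligands reported only as ">10,000" are left out because no point estimate is given for them.

```python
# Binding IC50 values (nM) from Table 1. Ligands reported only as ">10,000"
# (metoclopramide, cisapride) are omitted: no point estimate is available.
IC50_NM = {
    "human motilin": 0.5,
    "erythromycin": 427.0,
    "roxithromycin": 613.0,
    "canine motilin": 0.2,
    "Leu13 MTL": 0.2,
    "1-11 MTL": 127.0,
    "1-12 MTL": 14.0,
    "1-13 MTL": 0.9,
    "1-19 MTL": 0.3,
    "10-22 MTL": 1100.0,
}


def rank_by_potency(ic50_by_ligand):
    """Ligand names ordered from most potent (lowest IC50) to least potent."""
    return sorted(ic50_by_ligand, key=ic50_by_ligand.get)


def fold_shift(ic50_by_ligand, reference):
    """IC50 of each ligand divided by the IC50 of the reference ligand."""
    ref = ic50_by_ligand[reference]
    return {name: value / ref for name, value in ic50_by_ligand.items()}


for name in rank_by_potency(IC50_NM):
    print(name, fold_shift(IC50_NM, "human motilin")[name])
```

The same two helpers could be applied to the aequorin EC50 column to compare the functional and binding rank orders.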

EXAMPLE 5 
Southern Blot Analysis 

A genomic Southern blot (EcoRI- and BamHI-digested DNA, 10 
µg/lane) was hybridized with the ORF of MTL-1A. Post-hybridization 
washes were performed at 55°C in 4 X SSPE, after which 
the filters were dried and exposed to X-ray film for 5 days at -70°C. 
Lambda Hind III DNA markers were (in kb): 23.1, 9.4, 6.6, 4.4, 2.3, 
2.1. Southern blot analysis conducted in a variety of mammalian and 
non-mammalian species revealed a simple hybridization pattern 
consistent with a single, conserved gene encoding MTL-1A. 

EXAMPLE 6 
Puffer Fish Clone 75E7 

Screening of a puffer fish genomic library identified a single 
clone (75E7) containing an open reading frame of 363 amino acids with 
approximately 54% protein sequence identity to the human MTL-R1A. 
In addition, 75E7 has an intron-exon structure similar to that of the human 
MTL-R1A. 75E7 may be the ortholog of the human MTL-R1A. 
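
A percent identity figure of this kind is computed from a gapped pairwise alignment. The sketch below shows one common convention, counting identical residues over aligned (non-gap) positions. It is an assumption that this convention matches the one used for the 54% figure; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percent of aligned, non-gap positions at which both residues match.

    Both inputs must be rows of the same alignment (equal length).
    Positions where either row has a gap character are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Toy alignment rows for illustration; not taken from the receptor sequences.
print(percent_identity("AC.GT", "ACTGA"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so reported identities can differ slightly between programs.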
EXAMPLE 7 
Expression of the MTL-R1A Gene 

Transcripts of MTL-1A were detected by RNase Protection 
Assay (RPA). Synthesis of high-specific activity radiolabeled antisense 
probes and the RPA were conducted using kits (MAXIscript and 
HybSpeed RPA kits; Ambion, Austin, TX) essentially as described by 
the manufacturer. The anti-sense cRNA MTL-1A probe was 
synthesized from a cDNA template encompassing nt 1234 to 1516 of the 
human MTL-1A inserted behind the T7 promoter in pLitmus 28 (New 
England Biolabs, Beverly, MA). Digestion of the construct with Stu I 
generated a cRNA transcript approximately 340 nt in size with 
approximately 60 nt of vector sequence. Input poly A+ mRNA 
(Clontech, Palo Alto, CA) was 5 µg for the MTL-1A probe and 0.1 ng 
for a control human actin probe. Precipitated fragments were subjected 
to slab-gel electrophoresis (42 cm x 32 cm x 0.4 mm) in 5% 
acrylamide/Tris-borate-EDTA buffer containing 8 M urea. The gels 
were fixed, dried and autoradiographed on film (X-Omat; Kodak, 
Rochester, NY) for 1-3 days (MTL-1A) or 2 hrs. (actin). 

The distribution profile of MTL-1A mRNA was examined in a 
panel of GI and non-GI human tissues. MTL-1A mRNA could be 
detected in whole stomach (most prominently), thyroid, and bone 
marrow but was absent from several brain regions and other non-CNS 
tissues. 
WHAT IS CLAIMED: 

1. A motilin receptor, substantially free from receptor-associated 
proteins. 

2. A motilin receptor according to Claim 1 which is 
human. 

3. A motilin receptor according to Claim 2 which is 
MTL-R1A having the amino acid sequence SEQ.ID.NO.:3. 

4. A motilin receptor according to Claim 3 having the 
nucleic acid sequence SEQ.ID.NO.:2. 

5. A motilin receptor according to Claim 2 which is 
MTL-R1B having the amino acid sequence SEQ.ID.NO.:5. 

6. A motilin receptor according to Claim 5 having the 
nucleic acid sequence SEQ.ID.NO.:4. 

7. A motilin receptor according to Claim 6 which is 
75E7 having the amino acid sequence SEQ.ID.NO.:7. 



8. A method for determining whether a ligand is 
capable of binding to a motilin receptor comprising: 

(a) transfecting test cells with an expression vector 
encoding motilin receptor; 

(b) exposing the test cells to the ligand; 

(c) measuring the amount of binding of the ligand 
to the motilin receptor; 

(d) comparing the amount of binding of the ligand 
to the motilin receptor in the test cells with the amount of binding of the 
ligand to control cells that have not been transfected with the motilin 
receptor; wherein if the amount of binding of the ligand to the test cells 
is greater than the amount of binding of the ligand to the control cells, 
then the substance is capable of binding to motilin receptor. 
TTGAAATTATCTGGTCACTGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCAGTTTGGGAGGTCGA 

GGCGGGTGGACCACCTGGGGTCAGGAGTTCGAGACCAGGCTGGCCAACATGGCGAAACCCTGACTACA 

CAAAAAACACAAAATTTAGCCGGGGCTTGGGCGCTCCTGTGCTCCCAGCTACTCAGGAGGCTGAGGTG 

GGAGGACTGCTTGAGCCTGGGAGGTCGAGGCTGCAGTGAGCTGTGATCGCGCCACTTAAACTCCAGCC 

TGGACGACAGTGAGACCCTGTCTCAAGAAGAAAAAAAGAAAGAAAGAAAGAAAAAAAGAAAAAAAAGA 

AATTATTTGGTCAATTATATGGTCAGCTCCCTCCACCACTCGCGAATTTACAGAAGAGGAGAACTGGG 

CTGGGCGAGACCAGGACTAGCCCAAGATTACACAAGTTACTCGGTTGTAGAGCCAGGATTAGACAGGA 

GAGGCTCTAGATTCTGGTCTAGACTCCCCTCCTATTATTTAGCATTATGGCTTCCTGAGGATTACCAT 

GAGCCCTCCTCCACCGTCAAGCGGCAGCTACCAGCCACCAGACCAGATCCCTTCGAAGGTGCCCGGAG 

TACCAGACTGACAAAAGCGCCCGTACAGTGCTCAGTCCTGTAACCAAAGCTGTCTAGGGTGCAGACAT 

CGCTCACCGGACCGGGTAGGGCTCGTGCGCTAAGGGCGCCGGGTATTCCAGTTAGTGGAGAGGGAAGC 

GCCCTGGAACTGCATGGGCCCGGGAGAGGGCGCGGGAGCGGAGCATGGCCGGGCCGGGGCGGGCCGCG 

GCCGTGGGCGGAGACTGCGCGCAGCTAGCTCGGGAGCGCCTCGGAGCCCACCCCGCAGAGCCGCTTCT 

CGCGCCCCGCAGCGCAGCGCAGCGCTCCGCCGTCTGACCTGCCGCGCCCGCAGCGTGCGGGCTGGGAA 

AGGAGGCGCTCACCGAGAGGGACCACGCGCCAGGCTCCCAGCCCGACCCGGGACGCGGCGGCCGCGCG 

GAGCACCCATGGGCAGCCCCTGGAACGGCAGCGACGGCCCCGAGGGGGCGCGGGAGCCGCCGTGGCCC 

GCGCTGCCGCCTTGCGACGAGCGCCGCTGCTCGCCCTTTCCCCTGGGGGCGCTGGTGCCGGTGACCGC 

TGTGTGCCTGTGCCTGTTCGTCGTCGGGGTGAGCGGCAACGTGGTGACCGTGATGCTGATCGGGCGCT 

ACCGGGACATGCGGACCACCACCAACTTGTACCTGGGCAGCATGGCCGTGTCCGACCTACTCATCCTG 

CTCGGGCTGCCGTTCGACCTGTACCGCCTCTGGCGCTCGCGGCCCTGGGTGTTCGGGCCGCTGCTCTG 

CCGCCTGTCCCTCTACGTGGGCGAGGGCTGCACCTACGCCACGCTGCTGCACATGACCGCGCTCAGCG 

TCGAGCGCTACCTGGCCATCTGCCGCCCGCTCCGCGCCCGCGTCTTGGTCACCCGGCGCCGCGTCCGC 

GCGCTCATCGCTGTGCTCTGGGCCGTGGCGCTGCTCTCTGCCGGTCCCTTCTTGTTCCTGGTGGGCGT 

CGAGCAGGACCCCGGCATCTCCGTAGTCCCGGGCCTCAATGGCACCGCGCGGATCGCCTCCTCGCCTC 

TCGCCTCGTCGCCGCCTCTCTGGCTCTCGCGGGCGCCACCGCCGTCCCCGCCGTCGGGGCCCGAGACC 

GCGGAGGCCGCGGCGCTGTTCAGCCGCGAATGCCGGCCGAGCCCCGCGCAGCTGGGCGCGCTGCGTGT 

CATGCTGTGGGTCACCACCGCCTACTTCTTCCTGCCCTTTCTGTGCCTCAGCATCCTCTACGGGCTCA 

TCGGGCGGGAGCTGTGGAGCAGCCGGCGGCCGCTGCGAGGCCCGGCCGCCTCGGGGCGGGAGAGAGGC 

CACCGGCAGACCGTCCGCGTCCTGCgtaagtggagccgccgtggttccaaagacgcctgcctgcagtc 

cgccccgccggggaccgcgcaaacgctccctccccttcccctgctcgcccagctctgggcgccgcttc 

cagctcccttcctatttcgattccagcctccacccgccggtcattcccatcccccgagaaaaccatgt 

cctgtcccccaggagctctgggggaccccagggcgctttgagggtgggatccccggatccgattcagt 

aaccagcagtgcttttccagagcctctgagaccagaaaggagagttggtaattcttaatccaaccacc 

tgttagatgccacaaatgaggagtcctcacagtgctcttgagaagacgagggagatttcattaagcta 

aaattttttatttaatgttaagtgatgctgaaggctaaagtaaaccttgctcgtatcaaaaagtaaag 

attgtgcagacctgttgtagaattcttttcaacagagaacagaaaacttgtctccgaagtgggtttgt 

ggaaggaagcctgccaaggcggcttgttcagagaaattgctccttctggtttatgtccagccttgata 

acacatatgggagcctactatgcagttttaaagcaagtatccatgcagcctgcagcctggtcattttt 

tctggggtgaggatctgcctaggtagaagttttctctaatttattttgctgttacttgttattgcaga 

tggttccttgtcggggtggggggtttatttgcttcccaatgcttttgttaatcccggtgctgtgtctt 

atgttgcagTGGTGGTGGTTCTGGCATTTATAATTTGCTGGTTGCCCTTCCACGTTGGCAGAATCATT 

TACATAAACACGGAAGATTCGCGGATGATGTACTTCTCTCAGTACTTTAACATCGTCGCTCTGCAACT 

TTTCTATCTGAGCGCATCTATCAACCCAATCCTCTACAACCTCATTTCAAAGAAGTACAGAGCGGCGG 

CCTTTAAACTGCTGCTCGCAAGGAAGTCCAGGCCGAGAGGCTTCCACAGAAGCAGGGACACTGCGGGG 

GAAGTTGCAGGGGACACTGGAGGAGACACGGTGGGCTACACCGAGACAAGCGCTAACGTGAAGACGAT 

GGGATAA 

FIG.1 
ATGGGCAGCCCCTGGAACGGCAGCGACGGCCCCGAGGGGGCGCGGGAGCCGCCGTGGCCCGCGCTG 

CCGCCTTGCGACGAGCGCCGCTGCTCGCCCTTTCCCCTGGGGGCGCTGGTGCCGGTGACCGCTGTG 

TGCCTGTGCCTGTTCGTCGTCGGGGTGAGCGGCAACGTGGTGACCGTGATGCTGATCGGGCGCTAC 

CGGGACATGCGGACCACCACCAACTTGTACCTGGGCAGCATGGCCGTGTCCGACCTACTCATCCTG 

CTCGGGCTGCCGTTCGACCTGTACCGCCTCTGGCGCTCGCGGCCCTGGGTGTTCGGGCCGCTGCTC 

TGCCGCCTGTCCCTCTACGTGGGCGAGGGCTGCACCTACGCCACGCTGCTGCACATGACCGCGCTC 

AGCGTCGAGCGCTACCTGGCCATCTGCCGCCCGCTCCGCGCCCGCGTCTTGGTCACCCGGCGCCGC 

GTCCGCGCGCTCATCGCTGTGCTCTGGGCCGTGGCGCTGCTCTCTGCCGGTCCCTTCTTGTTCCTG 

GTGGGCGTCGAGCAGGACCCCGGCATCTCCGTAGTCCCGGGCCTCAATGGCACCGCGCGGATCGCC 

TCCTCGCCTCTCGCCTCGTCGCCGCCTCTCTGGCTCTCGCGGGCGCCACCGCCGTCCCCGCCGTCG 

GGGCCCGAGACCGCGGAGGCCGCGGCGCTGTTCAGCCGCGAATGCCGGCCGAGCCCCGCGCAGCTG 

GGCGCGCTGCGTGTCATGCTGTGGGTCACCACCGCCTACTTCTTCCTGCCCTTTCTGTGCCTCAGC 

ATCCTCTACGGGCTCATCGGGCGGGAGCTGTGGAGCAGCCGGCGGCCGCTGCGAGGCCCGGCCGCC 

TCGGGGCGGGAGAGAGGCCACCGGCAGACCGTCCGCGTCCTGCTGGTGGTGGTTCTGGCATTTATA 

ATTTGCTGGTTGCCCTTCCACGTTGGCAGAATCATTTACATAAACACGGAAGATTCGCGGATGATG 

TACTTCTCTCAGTACTTTAACATCGTCGCTCTGCAACTTTTCTATCTGAGCGCATCTATCAACCCA 

ATCCTCTACAACCTCATTTCAAAGAAGTACAGAGCGGCGGCCTTTAAACTGCTGCTCGCAAGGAAG 

TCCAGGCCGAGAGGCTTCCACAGAAGCAGGGACACTGCGGGGGAAGTTGCAGGGGACACTGGAGGA 

GACACGGTGGGCTACACCGAGACAAGCGCTAACGTGAAGACGATGGGATAA 

FIG. 2 
MGSPWNGSDGPEGAREPPWPALPPCDERRCSPFPLGALVPVTAVCLCLFVVGVSGNVVTVMLIGRY 
RDMRTTTNLYLGSMAVSDLLILLGLPFDLYRLWRSRPWVFGPLLCRLSLYVGEGCTYATLLHMTAL 
SVERYLAICRPLRARVLVTRRRVRALIAVLWAVALLSAGPFLFLVGVEQDPGISVVPGLNGTARIA 
SSPLASSPPLWLSRAPPPSPPSGPETAEAAALFSRECRPSPAQLGALRVMLWVTTAYFFLPFLCLS 
ILYGLIGRELWSSRRPLRGPAASGRERGHRQTVRVLLVVVLAFIICWLPFHVGRIIYINTEDSRMM 
YFSQYFNIVALQLFYLSASINPILYNLISKKYRAAAFKLLLARKSRPRGFHRSRDTAGEVAGDTGG 
DTVGYTETSANVKTMG 

FIG. 3 
ATGGGCAGCCCCTGGAACGGCAGCGACGGCCCCGAGGGGGCGCGGGAGCCGCCGTGGCCCGCGCTG 

CCGCCTTGCGACGAGCGCCGCTGCTCGCCCTTTCCCCTGGGGGCGCTGGTGCCGGTGACCGCTGTG 

TGCCTGTGCCTGTTCGTCGTCGGGGTGAGCGGCAACGTGGTGACCGTGATGCTGATCGGGCGCTAC 

CGGGACATGCGGACCACCACCAACTTGTACCTGGGCAGCATGGCCGTGTCCGACCTACTCATCCTG 

CTCGGGCTGCCGTTCGACCTGTACCGCCTCTGGCGCTCGCGGCCCTGGGTGTTCGGGCCGCTGCTC 

TGCCGCCTGTCCCTCTACGTGGGCGAGGGCTGCACCTACGCCACGCTGCTGCACATGACCGCGCTC 

AGCGTCGAGCGCTACCTGGCCATCTGCCGCCCGCTCCGCGCCCGCGTCTTGGTCACCCGGCGCCGC 

GTCCGCGCGCTCATCGCTGTGCTCTGGGCCGTGGCGCTGCTCTCTGCCGGTCCCTTCTTGTTCCTG 

GTGGGCGTCGAGCAGGACCCCGGCATCTCCGTAGTCCCGGGCCTCAATGGCACCGCGCGGATCGCC 

TCCTCGCCTCTCGCCTCGTCGCCGCCTCTCTGGCTCTCGCGGGCGCCACCGCCGTCCCCGCCGTCG 

GGGCCCGAGACCGCGGAGGCCGCGGCGCTGTTCAGCCGCGAATGCCGGCCGAGCCCCGCGCAGCTG 

GGCGCGCTGCGTGTCATGCTGTGGGTCACCACCGCCTACTTCTTCCTGCCCTTTCTGTGCCTCAGC 

ATCCTCTACGGGCTCATCGGGCGGGAGCTGTGGAGCAGCCGGCGGCCGCTGCGAGGCCCGGCCGCC 

TCGGGGCGGGAGAGAGGCCACCGGCAGACCGTCCGCGTCCTGCGTAAGTGGAGCCGCCGTGGTTCC 

AAAGACGCCTGCCTGCAGTCCGCCCCGCCGGGGACCGCGCAAACGCTGGGTCCCCTTCCCCTGCTC 

GCCCAGCTCTGGGCGCCGCTTCCAGCTCCCTTTCCTATTTCGATTCCAGCCTCCACCCGCCGTGGT 

GGTGGTTCTGGCATTTATAATTTGCTGGTTGCCCTTCCACGTTGGCAGAATCATTTACATAAACAC 

GGAAGATTCGCGGATGATGTACTTCTCTCAGTACTTTAACATCGTCGCTCTGCAACTTTTCTATCT 

GAGCGCATCTATCAACCCAATCCTCTACAACCTCATTTCAAAGAAGTACAGAGCGGCGGCCTTTAA 

ACTGCTGCTCGCAAGGAAGTCCAGGCCGAGAGGCTTCCACAGAAGCAGGGACACTGCGGGGGAAGT 

TGCAGGGGACACTGGAGGAGACACGGTGGGCTACACCGAGACAAGCGCTAACGTGAAGACGATGGG 

ATAA 



FIG. 4 
MGSPWNGSDGPEGAREPPWPALPPCDERRCSPFPLGALVPVTAVCLCLFVVGVSGNVVTVMLIGRY 
RDMRTTTNLYLGSMAVSDLLILLGLPFDLYRLWRSRPWVFGPLLCRLSLYVGEGCTYATLLHMTAL 
SVERYLAICRPLRARVLVTRRRVRALIAVLWAVALLSAGPFLFLVGVEQDPGISVVPGLNGTARIA 
SSPLASSPPLWLSRAPPPSPPSGPETAEAAALFSRECRPSPAQLGALRVMLWVTTAYFFLPFLCLS 
ILYGLIGRELWSSRRPLRGPAASGRERGHRQTVRVLRKWSRRGSKDACLQSAPPGTAQTLGPLPLL 
AQLWAPLPAPFPISIPASTRRGGGSGIYNLLVALPRWQNHLHKHGRFADDVLLSVL 



FIG. 5 
ATGCCCTGGACCAGACCCCAGGTGGACCTCCATGCTGCTGCAGCAGAGACCATGGACCAGTACACG 

ACGGACGACCACCACTACGAGGGCTCCCTCTTCCCCGCGTCCACCCTCATCCCCGTCACGGTCATC 

TGCATCCTCATCTTCGTGGTCGGCGTGACCGGCAACACCATGACCATCCTCATCATCCAGTACTTC 

AAGGACATGAAGACCACCACCAACCTGTACCTGTCCAGCATGGCCGTGTCCGACCTCGTCATCTTC 

CTCTGCCTGCCCTTCGACCTGTACCGCCTGTGGAAGTACGTGCCGTGGCTGTTCGGCGAGGCCGTG 

TGCCGCCTCTACCACTACATCTTCGAAGGCTGCACGTCGGCCACCATCCTCCACATCACGGCCCTG 

AGCATCGAGCGCTACCTGGCCATCAGCTTCCCCCTCAGGAGCAAGGTGATGGTGACCAGGAGAAGG 

GTCCAGTACATCATCCTGGCCCTGTGGTGCTTCGCCCTGGTGTCGGCCGCTCCCACGCTCTTCCTG 

GTCGGGGTGGAGTACGACAACGAGACGCACCCCGACTACAACACGGGCCAGTGCAAGCACACGGGC 

TACGCCATCAGCTCGGGGCAGCTGCACATCATGATCTGGGTGTCCACCACCTACTTCTTCTGCCCG 

ATGCTGTGTCTCCTCTTCCTCTACGGCTCCATCGGGTGCAAGCTGTGGAAGAGCAAGAACGACCTG 

CAGGGCCCGTGCGCCCTGGCCCGCGAGAGGTCGCACAGGCAAACGGTGAAGATCCTGGTGGTGGTG 

GTGCTGGCCTTCATCATCTGCTGGCTGCCCTACCACATCGGCAGGAACCTGTTCGCCCAGGTGGAC 

GACTACGACACGGCCATGCTCAGCCAGAATTTCAACATGGCCTCCATGGTGCTCTGCTACCTCAGC 

GCCTCCATCAACCCCGTCGTCTACAACCTGATGTCGAGGAAGTACCGGGCCGCCGCCAAGCGCCTC 

TTCCTGCTCCACCAGAGACCCAAGCCGGCCCACCGGGGGCAGGGGCAGTTTTGCATGATCGGCCAC 

AGCCCCACCCTGGACGAGAGCCTGACGGGGGTGTGA 

FIG. 7 
MPWTRPQVDLHAAAAETMDQYTTDDHHYEGSLFPASTLIPVTVICILIFVVGVTGNT 

MTILIIQYFKDMKTTTNLYLSSMAVSDLVIFLCLPFDLYRLWKYVPWLFGEAVCRLY 

HYIFEGCTSATILHITALSIERYLAISFPLRSKVMVTRRRVQYIILALWCFALVSAA 

PTLFLVGVEYDNETHPDYNTGQCKHTGYAISSGQLHIMIWVSTTYFFCPMLCLLFLY 

GSIGCKLWKSKNDLQGPCALARERSHRQTVKILVVVVLAFIICWLPYHIGRNLFAQV 

DDYDTAMLSQNFNMASMVLCYLSASINPVVYNLMSRKYRAAAKRLFLLHQRPKPAHR 

GQGQFCMIGHSPTLDESLTGV 

FIG. 8 




pu75E7 1 MPWTRPQVDLHAAAAETMDQYTTDDHHYEGSLFPASTLIPVTVICILI 48 

II III I. II MIT:!: : 

huMTLR 1 MGSPWNGS..DGPEGAREPPWPALPPCDERRCSPFPLGALVPVTAVCLCL 48 

49 FVVGVTGNTMTILIIQYFKDMKTTTNLYLSSMAVSDLVIFLCLPFDLYRL 98 

Mill. I I I- I -MM Mill I I Mill I I III II II I I 

49 FVVGVSGNVVTVMLIGRYRDMRTTTNLYLGSMAVSDLLILLGLPFDLYRL 98 

99 WKYVPWLFGEAVCRLYHYIFEGCTSATILHITALSIERYLAISFPLRSKV 148 

I lilt -III I Mil Ihll-IIM-IIMM II I • • I 

99 WRSRPWVFGPLLCRLSLYVGEGCTYATLLHMTALSVERYLAICRPLRARV 148 

149 MVTRRRVQYIILALWCFALVSAAPTLFLVGVE YD 182 

MUM I II IMI I II I II I I I 

149 LVTRRRVRALIAVLWAVALLSAGPFLFLVGVEQDPGISVVPGLNGTARIA 198 

183 NETHPDYNTGQCKHTGYAISS GQLHIM 209 

I -I I: I I I : I 

199 SSPLASSPPLWLSRAPPPSPPSGPETAEAAALFSRECRPSPAQLGALRVM 248 

210 IWVSTTYFFCPMLCLLFLYGSIGCKLWKSKNDLQGPCALARERSHRQTVK 259 

Mil III I III Mill M |: III I III IMIM 
249 LWVTTAYFFLPFLCLSILYGLIGRELWSSRRPLRGPAASGRERGHRQTVR 298 

260 ILVVVVLAFIICWLPYHIGRNLFAQVDDYDTAMLSQNFNMASMVLCYLSA 309 

M I I II II II II II M : I I :: : I M II- -: I MM 

299 VLLVVVLAFIICWLPFHVGRIIYINTEDSRMMYFSQYFNIVALQLFYLSA 348 

310 SINPVVYNLMSRKYRAAAKRLFLLHQ.RPKPAHRGQ...GQFCMIGHSPT 355 

IMM III IMIIMI : I I ■ II- II • M I 

349 SINPILYNLISKKYRAAAFKLLLARKSRPRGFHRSRDTAGEVAGDTGGDT 398 

356 LDESLTGV 363 

399 VGYTETSANVKTMG 412 

FIG. 9 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

  (i) APPLICANT: Merck & Co., Inc. 
  (ii) TITLE OF INVENTION: CLONING AND IDENTIFICATION 
       OF THE MOTILIN RECEPTOR 
  (iii) NUMBER OF SEQUENCES: 15 
  (iv) CORRESPONDENCE ADDRESS: 
       (A) ADDRESSEE: Merck & Co., Inc. 
       (B) STREET: P.O. Box 2000, 126 E. Lincoln Ave. 
       (C) CITY: Rahway 
       (D) STATE: NJ 
       (E) COUNTRY: USA 
       (F) ZIP: 07065-0900 
  (v) COMPUTER READABLE FORM: 
       (A) MEDIUM TYPE: Diskette 
       (B) COMPUTER: IBM Compatible 
       (C) OPERATING SYSTEM: Windows 
       (D) SOFTWARE: FastSEQ for Windows Version 2.0b 
  (vi) CURRENT APPLICATION DATA: 
       (A) APPLICATION NUMBER: 
       (B) FILING DATE: 
       (C) CLASSIFICATION: 
  (vii) PRIOR APPLICATION DATA: 
       (A) APPLICATION NUMBER: 60/089,098 
       (B) FILING DATE: 12-JUN-1998 
  (viii) ATTORNEY/AGENT INFORMATION: 
       (A) NAME: Giesser, Joanne M 
       (B) REGISTRATION NUMBER: 32,838 
       (C) REFERENCE/DOCKET NUMBER: 20251 PCT 
  (ix) TELECOMMUNICATION INFORMATION: 
       (A) TELEPHONE: 732-594-3046 
       (B) TELEFAX: 732-594-4720 
       (C) TELEX: 

(2) INFORMATION FOR SEQ ID NO:1: 

  (i) SEQUENCE CHARACTERISTICS: 
       (A) LENGTH: 3066 base pairs 
       (B) TYPE: nucleic acid 
       (C) STRANDEDNESS: double 
       (D) TOPOLOGY: linear 
  (ii) MOLECULE TYPE: Genomic DNA 
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

TTGAAATTAT CTGGTCACTG CCGGGCGCGG TGGCTCACGC CTGTAATCCC AGCACTTTGG 60 
GAGGTCGAGG CGGGTGGACC ACCTGGGGTC AGGAGTTCGA GACCAGGCTG GCCAACATGG 120 
CGAAACCCTG ACTACACAAA AAACACAAAA TTTAGCCGGG GCTTGGGCGC TCCTGTGCTC 180 
CCAGCTACTC AGGAGGCTGA GGTGGGAGGA CTGCTTGAGC CTGGGAGGTC GAGGCTGCAG 240 
TGAGCTGTGA TCGCGCCACT TAAACTCCAG CCTGGACGAC AGTGAGACCC TGTCTCAAGA 300 
AGAAAAAAAG AAAGAAAGAA AGAAAAAAAG AAAAAAAAGA AATTATTTGG TCAATTATAT 360 
GGTCAGCTCC CTCCACCACT CGCGAATTTA CAGAAGAGGA GAACTGGGCT GGGCGAGACC 420 
AGGACTAGCC CAAGATTACA CAAGTTACTC GGTTGTAGAG CCAGGATTAG ACAGGAGAGG 480 
CTCTAGATTC TGGTCTAGAC TCCCCTCCTA TTATTTAGCA TTATGGCTTC CTGAGGATTA 540 
CCATGAGCCC TCCTCCACCG TCAAGCGGCA GCTACCAGCC ACCAGACCAG ATCCCTTCGA 600 
AGGTGCCCGG AGTACCAGAC TGACAAAAGC GCCCGTACAG TGCTCAGTCC TGTAACCAAA 660 
GCTGTCTAGG GTGCAGACAT CGCTCACCGG ACCGGGTAGG GCTCGTGCGC TAAGGGCGCC 720 
GGGTATTCCA GTTAGTGGAG AGGGAAGCGC CCTGGAACTG CATGGGCCCG GGAGAGGGCG 780 
CGGGAGCGGA GCATGGCCGG GCCGGGGCGG GCCGCGGCCG TGGGCGGAGA CTGCGCGCAG 840 
CTAGCTCGGG AGCGCCTCGG AGCCCACCCC GCAGAGCCGC TTCTCGCGCC CCGCAGCGCA 900 
GCGCAGCGCT CCGCCGTCTG ACCTGCCGCG CCCGCAGCGT GCGGGCTGGG AAAGGAGGCG 960 
CTCACCGAGA GGGACCACGC GCCAGGCTCC CAGCCCGACC CGGGACGCGG CGGCCGCGCG 1020 
GAGCACCCAT GGGCAGCCCC TGGAACGGCA GCGACGGCCC CGAGGGGGCG CGGGAGCCGC 1080 
CGTGGCCCGC GCTGCCGCCT TGCGACGAGC GCCGCTGCTC GCCCTTTCCC CTGGGGGCGC 1140 
TGGTGCCGGT GACCGCTGTG TGCCTGTGCC TGTTCGTCGT CGGGGTGAGC GGCAACGTGG 1200 
TGACCGTGAT GCTGATCGGG CGCTACCGGG ACATGCGGAC CACCACCAAC TTGTACCTGG 1260 
GCAGCATGGC CGTGTCCGAC CTACTCATCC TGCTCGGGCT GCCGTTCGAC CTGTACCGCC 1320 
TCTGGCGCTC GCGGCCCTGG GTGTTCGGGC CGCTGCTCTG CCGCCTGTCC CTCTACGTGG 1380 
GCGAGGGCTG CACCTACGCC ACGCTGCTGC ACATGACCGC GCTCAGCGTC GAGCGCTACC 1440 
TGGCCATCTG CCGCCCGCTC CGCGCCCGCG TCTTGGTCAC CCGGCGCCGC GTCCGCGCGC 1500 
TCATCGCTGT GCTCTGGGCC GTGGCGCTGC TCTCTGCCGG TCCCTTCTTG TTCCTGGTGG 1560 
GCGTCGAGCA GGACCCCGGC ATCTCCGTAG TCCCGGGCCT CAATGGCACC GCGCGGATCG 1620 
CCTCCTCGCC TCTCGCCTCG TCGCCGCCTC TCTGGCTCTC GCGGGCGCCA CCGCCGTCCC 1680 
CGCCGTCGGG GCCCGAGACC GCGGAGGCCG CGGCGCTGTT CAGCCGCGAA TGCCGGCCGA 1740 
GCCCCGCGCA GCTGGGCGCG CTGCGTGTCA TGCTGTGGGT CACCACCGCC TACTTCTTCC 1800 
TGCCCTTTCT GTGCCTCAGC ATCCTCTACG GGCTCATCGG GCGGGAGCTG TGGAGCAGCC 1860 
GGCGGCCGCT GCGAGGCCCG GCCGCCTCGG GGCGGGAGAG AGGCCACCGG CAGACCGTCC 1920 
GCGTCCTGCG TAAGTGGAGC CGCCGTGGTT CCAAAGACGC CTGCCTGCAG TCCGCCCCGC 1980 
CGGGGACCGC GCAAACGCTG GGTCCCCTTC CCCTGCTCGC CCAGCTCTGG GCGCCGCTTC 2040 
CAGCTCCCTC CTATTTCGAT TCCAGCCTCC ACCCGCCGGT ACTTCCCATC CCCCGAGAAA 2100 
ACCATGTCCT GTCCCCCAGG AGCTCTGGGG GACCCCAGGG CGCTTTGAGG GTGGGATCCC 2160 
CGGATCCGAT TCAGTAACCA GCAGTGCTTT TCCAGAGCCT CTGAGACCAG AAAGGAGAGT 2220 
TGGTAATTCT TAATCCAACC ACCTGTTAGA TGCCACAAAT GAGGAGTCCT CACAGTGCTC 2280 
TTGAGAAGAC GAGGGAGATT TCATTAAGCT AAAATTTTTT ATTTAATGTT AAGTGATGCT 2340 
GAAGGCTAAA GTAAACCTTG CTCGTATCAA AAAGTAAAGA TTGTGCAGAC CTGTTGTAGA 2400 
ATTCTTTTCA ACAGAGAACA GAAAACTTGT CTCCGAAGTG GGTTTGTGGA AGGAAGCCTG 2460 
CCAAGGCGGC TTGTTCAGAG AAATTGCTCC TTCTGGTTTA TGTCCAGCCT TGATAACACA 2520 
TATGGGAGCC TACTATGCAG TTTTAAAGCA AGTATCCATG CAGCCTGCAG CCTGGTCATT 2580 
TTTTCTGGGG TGAGGATCTG CCTAGGTAGA AGTTTTCTCT AATTTATTTT GCTGTTACTT 2640 
GTTATTGCAG ATGGTTCCTT GTCGGGGTGG GGGGTTTATT TGCTTCCCAA TGCTTTTGTT 2700 
AATCCCGGTG CTGTGTCTTA TGTTGCAGTG GTGGTGGTTC TGGCATTTAT AATTTGCTGG 2760 
TTGCCCTTCC ACGTTGGCAG AATCATTTAC ATAAACACGG AAGATTCGCG GATGATGTAC 2820 
TTCTCTCAGT ACTTTAACAT CGTCGCTCTG CAACTTTTCT ATCTGAGCGC ATCTATCAAC 2880 
CCAATCCTCT ACAACCTCAT TTCAAAGAAG TACAGAGCGG CGGCCTTTAA ACTGCTGCTC 2940 
GCAAGGAAGT CCAGGCCGAG AGGCTTCCAC AGAAGCAGGG ACACTGCGGG GGAAGTTGCA 3000 
GGGGACACTG GAGGAGACAC GGTGGGCTAC ACCGAGACAA GCGCTAACGT GAAGACGATG 3060 
GGATAA 3066 
(2) INFORMATION FOR SEQ ID NO:2: 

  (i) SEQUENCE CHARACTERISTICS: 
       (A) LENGTH: 1239 base pairs 
       (B) TYPE: nucleic acid 
       (C) STRANDEDNESS: double 
       (D) TOPOLOGY: linear 
  (ii) MOLECULE TYPE: cDNA 
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

ATGGGCAGCC CCTGGAACGG CAGCGACGGC CCCGAGGGGG CGCGGGAGCC GCCGTGGCCC 60 
GCGCTGCCGC CTTGCGACGA GCGCCGCTGC TCGCCCTTTC CCCTGGGGGC GCTGGTGCCG 120 
GTGACCGCTG TGTGCCTGTG CCTGTTCGTC GTCGGGGTGA GCGGCAACGT GGTGACCGTG 180 
ATGCTGATCG GGCGCTACCG GGACATGCGG ACCACCACCA ACTTGTACCT GGGCAGCATG 240 
GCCGTGTCCG ACCTACTCAT CCTGCTCGGG CTGCCGTTCG ACCTGTACCG CCTCTGGCGC 300 
TCGCGGCCCT GGGTGTTCGG GCCGCTGCTC TGCCGCCTGT CCCTCTACGT GGGCGAGGGC 360 
TGCACCTACG CCACGCTGCT GCACATGACC GCGCTCAGCG TCGAGCGCTA CCTGGCCATC 420 
TGCCGCCCGC TCCGCGCCCG CGTCTTGGTC ACCCGGCGCC GCGTCCGCGC GCTCATCGCT 480 
GTGCTCTGGG CCGTGGCGCT GCTCTCTGCC GGTCCCTTCT TGTTCCTGGT GGGCGTCGAG 540 
CAGGACCCCG GCATCTCCGT AGTCCCGGGC CTCAATGGCA CCGCGCGGAT CGCCTCCTCG 600 
CCTCTCGCCT CGTCGCCGCC TCTCTGGCTC TCGCGGGCGC CACCGCCGTC CCCGCCGTCG 660 
GGGCCCGAGA CCGCGGAGGC CGCGGCGCTG TTCAGCCGCG AATGCCGGCC GAGCCCCGCG 720 
CAGCTGGGCG CGCTGCGTGT CATGCTGTGG GTCACCACCG CCTACTTCTT CCTGCCCTTT 780 
CTGTGCCTCA GCATCCTCTA CGGGCTCATC GGGCGGGAGC TGTGGAGCAG CCGGCGGCCG 840 
CTGCGAGGCC CGGCCGCCTC GGGGCGGGAG AGAGGCCACC GGCAGACCGT CCGCGTCCTG 900 
CTGGTGGTGG TTCTGGCATT TATAATTTGC TGGTTGCCCT TCCACGTTGG CAGAATCATT 960 
TACATAAACA CGGAAGATTC GCGGATGATG TACTTCTCTC AGTACTTTAA CATCGTCGCT 1020 
CTGCAACTTT TCTATCTGAG CGCATCTATC AACCCAATCC TCTACAACCT CATTTCAAAG 1080 
AAGTACAGAG CGGCGGCCTT TAAACTGCTG CTCGCAAGGA AGTCCAGGCC GAGAGGCTTC 1140 
CACAGAAGCA GGGACACTGC GGGGGAAGTT GCAGGGGACA CTGGAGGAGA CACGGTGGGC 1200 
TACACCGAGA CAAGCGCTAA CGTGAAGACG ATGGGATAA 1239 

(2) INFORMATION FOR SEQ ID NO:3: 

  (i) SEQUENCE CHARACTERISTICS: 
       (A) LENGTH: 412 amino acids 
       (B) TYPE: amino acid 
       (C) STRANDEDNESS: single 
       (D) TOPOLOGY: linear 
  (ii) MOLECULE TYPE: protein 
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Met Gly Ser Pro Trp Asn Gly Ser Asp Gly Pro Glu Gly Ala Arg Glu 
1               5                   10                  15 
Pro Pro Trp Pro Ala Leu Pro Pro Cys Asp Glu Arg Arg Cys Ser Pro 
            20                  25                  30 
Phe Pro Leu Gly Ala Leu Val Pro Val Thr Ala Val Cys Leu Cys Leu 
        35                  40                  45 
Phe Val Val Gly Val Ser Gly Asn Val Val Thr Val Met Leu Ile Gly 
    50                  55                  60 
Arg Tyr Arg Asp Met Arg Thr Thr Thr Asn Leu Tyr Leu Gly Ser Met 
65                  70                  75                  80 
Ala Val Ser Asp Leu Leu Ile Leu Leu Gly Leu Pro Phe Asp Leu Tyr 
                85                  90                  95 
Arg Leu Trp Arg Ser Arg Pro Trp Val Phe Gly Pro Leu Leu Cys Arg 
            100                 105                 110 
Leu Ser Leu Tyr Val Gly Glu Gly Cys Thr Tyr Ala Thr Leu Leu His 
        115                 120                 125 
Met Thr Ala Leu Ser Val Glu Arg Tyr Leu Ala Ile Cys Arg Pro Leu 
    130                 135                 140 
Arg Ala Arg Val Leu Val Thr Arg Arg Arg Val Arg Ala Leu Ile Ala 
145                 150                 155                 160 
Val Leu Trp Ala Val Ala Leu Leu Ser Ala Gly Pro Phe Leu Phe Leu 
                165                 170                 175 
Val Gly Val Glu Gln Asp Pro Gly Ile Ser Val Val Pro Gly Leu Asn 
            180                 185                 190 
Gly Thr Ala Arg Ile Ala Ser Ser Pro Leu Ala Ser Ser Pro Pro Leu 
        195                 200                 205 
Trp Leu Ser Arg Ala Pro Pro Pro Ser Pro Pro Ser Gly Pro Glu Thr 
    210                 215                 220 
Ala Glu Ala Ala Ala Leu Phe Ser Arg Glu Cys Arg Pro Ser Pro Ala 
225                 230                 235                 240 
Gln Leu Gly Ala Leu Arg Val Met Leu Trp Val Thr Thr Ala Tyr Phe 
                245                 250                 255 
Phe Leu Pro Phe Leu Cys Leu Ser Ile Leu Tyr Gly Leu Ile Gly Arg 
            260                 265                 270 
Glu Leu Trp Ser Ser Arg Arg Pro Leu Arg Gly Pro Ala Ala Ser Gly 
        275                 280                 285 
Arg Glu Arg Gly His Arg Gln Thr Val Arg Val Leu Leu Val Val Val 
    290                 295                 300 
Leu Ala Phe Ile Ile Cys Trp Leu Pro Phe His Val Gly Arg Ile Ile 
305                 310                 315                 320 
Tyr Ile Asn Thr Glu Asp Ser Arg Met Met Tyr Phe Ser Gln Tyr Phe 
                325                 330                 335 
Asn Ile Val Ala Leu Gln Leu Phe Tyr Leu Ser Ala Ser Ile Asn Pro 
            340                 345                 350 
Ile Leu Tyr Asn Leu Ile Ser Lys Lys Tyr Arg Ala Ala Ala Phe Lys 
        355                 360                 365 
Leu Leu Leu Ala Arg Lys Ser Arg Pro Arg Gly Phe His Arg Ser Arg 
    370                 375                 380 
Asp Thr Ala Gly Glu Val Ala Gly Asp Thr Gly Gly Asp Thr Val Gly 
385                 390                 395                 400 
Tyr Thr Glu Thr Ser Ala Asn Val Lys Thr Met Gly 
                405                 410 

(2) INFORMATION FOR SEQ ID NO:4: 

  (i) SEQUENCE CHARACTERISTICS: 
       (A) LENGTH: 1390 base pairs 
       (B) TYPE: nucleic acid 
       (C) STRANDEDNESS: double 
       (D) TOPOLOGY: linear 
  (ii) MOLECULE TYPE: cDNA 
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

ATGGGCAGCC CCTGGAACGG CAGCGACGGC CCCGAGGGGG CGCGGGAGCC GCCGTGGCCC 60 
GCGCTGCCGC CTTGCGACGA GCGCCGCTGC TCGCCCTTTC CCCTGGGGGC GCTGGTGCCG 120 
GTGACCGCTG TGTGCCTGTG CCTGTTCGTC GTCGGGGTGA GCGGCAACGT GGTGACCGTG 180 
ATGCTGATCG GGCGCTACCG GGACATGCGG ACCACCACCA ACTTGTACCT GGGCAGCATG 240 
GCCGTGTCCG ACCTACTCAT CCTGCTCGGG CTGCCGTTCG ACCTGTACCG CCTCTGGCGC 300 
TCGCGGCCCT GGGTGTTCGG GCCGCTGCTC TGCCGCCTGT CCCTCTACGT GGGCGAGGGC 360 
TGCACCTACG CCACGCTGCT GCACATGACC GCGCTCAGCG TCGAGCGCTA CCTGGCCATC 420 
TGCCGCCCGC TCCGCGCCCG CGTCTTGGTC ACCCGGCGCC GCGTCCGCGC GCTCATCGCT 480 
GTGCTCTGGG CCGTGGCGCT GCTCTCTGCC GGTCCCTTCT TGTTCCTGGT GGGCGTCGAG 540 
CAGGACCCCG GCATCTCCGT AGTCCCGGGC CTCAATGGCA CCGCGCGGAT CGCCTCCTCG 600 
CCTCTCGCCT CGTCGCCGCC TCTCTGGCTC TCGCGGGCGC CACCGCCGTC CCCGCCGTCG 660 
GGGCCCGAGA CCGCGGAGGC CGCGGCGCTG TTCAGCCGCG AATGCCGGCC GAGCCCCGCG 720 
CAGCTGGGCG CGCTGCGTGT CATGCTGTGG GTCACCACCG CCTACTTCTT CCTGCCCTTT 780 
CTGTGCCTCA GCATCCTCTA CGGGCTCATC GGGCGGGAGC TGTGGAGCAG CCGGCGGCCG 840 
CTGCGAGGCC CGGCCGCCTC GGGGCGGGAG AGAGGCCACC GGCAGACCGT CCGCGTCCTG 900 
CGTAAGTGGA GCCGCCGTGG TTCCAAAGAC GCCTGCCTGC AGTCCGCCCC GCCGGGGACC 960 
GCGCAAACGC TGGGTCCCCT TCCCCTGCTC GCCCAGCTCT GGGCGCCGCT TCCAGCTCCC 1020 
TTTCCTATTT CGATTCCAGC CTCCACCCGC CGTGGTGGTG GTTCTGGCAT TTATAATTTG 1080 
CTGGTTGCCC TTCCACGTTG GCAGAATCAT TTACATAAAC ACGGAAGATT CGCGGATGAT 1140 
GTACTTCTCT CAGTACTTTA ACATCGTCGC TCTGCAACTT TTCTATCTGA GCGCATCTAT 1200 
CAACCCAATC CTCTACAACC TCATTTCAAA GAAGTACAGA GCGGCGGCCT TTAAACTGCT 1260 
GCTCGCAAGG AAGTCCAGGC CGAGAGGCTT CCACAGAAGC AGGGACACTG CGGGGGAAGT 1320 
TGCAGGGGAC ACTGGAGGAG ACACGGTGGG CTACACCGAG ACAAGCGCTA ACGTGAAGAC 1380 
GATGGGATAA 1390 

(2) INFORMATION FOR SEQ ID NO:5: 

  (i) SEQUENCE CHARACTERISTICS: 
       (A) LENGTH: 386 amino acids 
       (B) TYPE: amino acid 
       (C) STRANDEDNESS: single 
       (D) TOPOLOGY: linear 
  (ii) MOLECULE TYPE: protein 
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



Met Gly Ser Pro Trp Asn Gly Ser Asp Gly Pro Glu Gly Ala Arg Glu 
1               5                   10                  15 
Pro Pro Trp Pro Ala Leu Pro Pro Cys Asp Glu Arg Arg Cys Ser Pro 
            20                  25                  30 
Phe Pro Leu Gly Ala Leu Val Pro Val Thr Ala Val Cys Leu Cys Leu 
        35                  40                  45 
Phe Val Val Gly Val Ser Gly Asn Val Val Thr Val Met Leu Ile Gly 
    50                  55                  60 
Arg Tyr Arg Asp Met Arg Thr Thr Thr Asn Leu Tyr Leu Gly Ser Met 
65                  70                  75                  80 
Ala Val Ser Asp Leu Leu Ile Leu Leu Gly Leu Pro Phe Asp Leu Tyr 
                85                  90                  95 
Arg Leu Trp Arg Ser Arg Pro Trp Val Phe Gly Pro Leu Leu Cys Arg 
            100                 105                 110 
Leu Ser Leu Tyr Val Gly Glu Gly Cys Thr Tyr Ala Thr Leu Leu His 
        115                 120                 125 
Met Thr Ala Leu Ser Val Glu Arg Tyr Leu Ala Ile Cys Arg Pro Leu 
    130                 135                 140 
Arg Ala Arg Val Leu Val Thr Arg Arg Arg Val Arg Ala Leu Ile Ala 
145                 150                 155                 160 
Val Leu Trp Ala Val Ala Leu Leu Ser Ala Gly Pro Phe Leu Phe Leu 
                165                 170                 175 
Val Gly Val Glu Gln Asp Pro Gly Ile Ser Val Val Pro Gly Leu Asn 
            180                 185                 190 
Gly Thr Ala Arg Ile Ala Ser Ser Pro Leu Ala Ser Ser Pro Pro Leu 
        195                 200                 205 
Trp Leu Ser Arg Ala Pro Pro Pro Ser Pro Pro Ser Gly Pro Glu Thr 
    210                 215                 220 
Ala Glu Ala Ala Ala Leu Phe Ser Arg Glu Cys Arg Pro Ser Pro Ala 
225                 230                 235                 240 
Gln Leu Gly Ala Leu Arg Val Met Leu Trp Val Thr Thr Ala Tyr Phe 
                245                 250                 255 
Phe Leu Pro Phe Leu Cys Leu Ser Ile Leu Tyr Gly Leu Ile Gly Arg 
            260                 265                 270 
Glu Leu Trp Ser Ser Arg Arg Pro Leu Arg Gly Pro Ala Ala Ser Gly 
        275                 280                 285 
Arg Glu Arg Gly His Arg Gln Thr Val Arg Val Leu Arg Lys Trp Ser 
    290                 295                 300 
Arg Arg Gly Ser Lys Asp Ala Cys Leu Gln Ser Ala Pro Pro Gly Thr 
305                 310                 315                 320 
Ala Gln Thr Leu Gly Pro Leu Pro Leu Leu Ala Gln Leu Trp Ala Pro 
                325                 330                 335 
Leu Pro Ala Pro Phe Pro Ile Ser Ile Pro Ala Ser Thr Arg Arg Gly 
            340                 345                 350 
Gly Gly Ser Gly Ile Tyr Asn Leu Leu Val Ala Leu Pro Arg Trp Gln 
        355                 360                 365 
Asn His Leu His Lys His Gly Arg Phe Ala Asp Asp Val Leu Leu Ser 
    370                 375                 380 
Val Leu 
385 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1092 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATGCCCTGGA CCAGACCCCA GGTGGACCTC CATGCTGCTG CAGCAGAGAC CATGGACCAG 60

TACACCACGG ACGACCACCA CTACGAGGGC TCCCTCTTCC CCGCGTCCAC CCTCATCCCC 120

GTCACGGTCA TCTGCATCCT CATCTTCGTG GTCGGCGTGA CCGGCAACAC CATGACCATC 180

CTCATCATCC AGTACTTCAA GGACATGAAG ACCACCACCA ACCTGTACCT GTCCAGCATG 240

GCCGTGTCCG ACCTCGTCAT CTTCCTCTGC CTGCCCTTCG ACCTGTACCG CCTGTGGAAG 300 

TACGTGCCGT GGCTGTTCGG CGAGGCCGTG TGCCGCCTCT ACCACTACAT CTTCGAAGGC 360

TGCACGTCGG CCACCATCCT CCACATCACG GCCCTGAGCA TCGAGCGCTA CCTGGCCATC 420

AGCTTCCCCC TCAGGAGCAA GGTGATGGTG ACCAGGAGAA GGGTCCAGTA CATCATCCTG 480 

GCCCTGTGGT GCTTCGCCCT GGTGTCGGCC GCTCCCACGC TCTTCCTGGT CGGGGTGGAG 540

TACGACAACG AGACGCACCC CGACTACAAC ACGGGCCAGT GCAAGCACAC GGGCTACGCC 600 

ATCAGCTCGG GGCAGCTGCA CATCATGATC TGGGTGTCCA CCACCTACTT CTTCTGCCCG 660 

ATGCTGTGTC TCCTCTTCCT CTACGGCTCC ATCGGGTGCA AGCTGTGGAA GAGCAAGAAC 720 

GACCTGCAGG GCCCGTGCGC CCTGGCCCGC GAGAGGTCGC ACAGGCAAAC GGTGAAGATC 780

CTGGTGGTGG TGGTGCTGGC CTTCATCATC TGCTGGCTGC CCTACCACAT CGGCAGGAAC 840

CTGTTCGCCC AGGTGGACGA CTACGACACG GCCATGCTCA GCCAGAATTT CAACATGGCC 900

TCCATGGTGC TCTGCTACCT CAGCGCCTCC ATCAACCCCG TCGTCTACAA CCTGATGTCG 960 

AGGAAGTACC GGGCCGCCGC CAAGCGCCTC TTCCTGCTCC ACCAGAGACC CAAGCCGGCC 1020 

CACCGGGGGC AGGGGCAGTT TTGCATGATC GGCCACAGCC CCACCCTGGA CGAGAGCCTG 1080 

ACGGGGGTGT GA 1092 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 363 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



Met Pro Trp Thr Arg Pro Gln Val Asp Leu His Ala Ala Ala Ala Glu
1               5                   10                  15

Thr Met Asp Gln Tyr Thr Thr Asp Asp His His Tyr Glu Gly Ser Leu
            20                  25                  30

Phe Pro Ala Ser Thr Leu Ile Pro Val Thr Val Ile Cys Ile Leu Ile
        35                  40                  45

Phe Val Val Gly Val Thr Gly Asn Thr Met Thr Ile Leu Ile Ile Gln
    50                  55                  60

Tyr Phe Lys Asp Met Lys Thr Thr Thr Asn Leu Tyr Leu Ser Ser Met
65                  70                  75                  80

Ala Val Ser Asp Leu Val Ile Phe Leu Cys Leu Pro Phe Asp Leu Tyr
                85                  90                  95

Arg Leu Trp Lys Tyr Val Pro Trp Leu Phe Gly Glu Ala Val Cys Arg
            100                 105                 110

Leu Tyr His Tyr Ile Phe Glu Gly Cys Thr Ser Ala Thr Ile Leu His
        115                 120                 125

Ile Thr Ala Leu Ser Ile Glu Arg Tyr Leu Ala Ile Ser Phe Pro Leu
    130                 135                 140

Arg Ser Lys Val Met Val Thr Arg Arg Arg Val Gln Tyr Ile Ile Leu
145                 150                 155                 160

Ala Leu Trp Cys Phe Ala Leu Val Ser Ala Ala Pro Thr Leu Phe Leu
                165                 170                 175

Val Gly Val Glu Tyr Asp Asn Glu Thr His Pro Asp Tyr Asn Thr Gly
            180                 185                 190

Gln Cys Lys His Thr Gly Tyr Ala Ile Ser Ser Gly Gln Leu His Ile
        195                 200                 205

Met Ile Trp Val Ser Thr Thr Tyr Phe Phe Cys Pro Met Leu Cys Leu
    210                 215                 220

Leu Phe Leu Tyr Gly Ser Ile Gly Cys Lys Leu Trp Lys Ser Lys Asn
225                 230                 235                 240

Asp Leu Gln Gly Pro Cys Ala Leu Ala Arg Glu Arg Ser His Arg Gln
                245                 250                 255

Thr Val Lys Ile Leu Val Val Val Val Leu Ala Phe Ile Ile Cys Trp
            260                 265                 270

Leu Pro Tyr His Ile Gly Arg Asn Leu Phe Ala Gln Val Asp Asp Tyr
        275                 280                 285

Asp Thr Ala Met Leu Ser Gln Asn Phe Asn Met Ala Ser Met Val Leu
    290                 295                 300

Cys Tyr Leu Ser Ala Ser Ile Asn Pro Val Val Tyr Asn Leu Met Ser
305                 310                 315                 320

Arg Lys Tyr Arg Ala Ala Ala Lys Arg Leu Phe Leu Leu His Gln Arg
                325                 330                 335

Pro Lys Pro Ala His Arg Gly Gln Gly Gln Phe Cys Met Ile Gly His
            340                 345                 350

Ser Pro Thr Leu Asp Glu Ser Leu Thr Gly Val
        355                 360















(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 
CCATCCTAAT ACGACTCACT ATAGGGC 27 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TTATCCCATC GTCTTCACGT TAGCGCTTGT CTC 33 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTGCCCTTTC TGTGCCTCAG CATCCTCTAC 30 
(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATGGGCAGCC CCTGGAACGG CAGCGACGGC CCCGAGGGGG CGCGGGAGCC GCCGTGGCCC 60 

GCGCTGCCGC CTTGCGACGA GCGCCGCTGC TCGCCCTTTC CCCTGGGGGC GCTGGTGCCG 120

GTGACCGCTG TGTGCCTGTG CCTGTTCGTC GTCGGGGTGA GCGGCAACGT GGTGACCGTG 180 

ATGCTGATCG GGCGCTACCG GGACATGCGG ACCACCACCA ACTTGTACCT GGGCAGCATG 240

GCCGTGTCCG ACCTACTCAT CCTGCTCGGG CTGCCGTTCG ACCTGTACCG CCTCTGGCGC 300 

TCGCGGCCCT GGGTGTTCGG GCCGCTGCTC TGCCGCCTGT CCCTCTACGT GGGCGAGGGC 360

TGCACCTACG CCACGCTGCT GCACATGACC GCGCTCAGCG TCGAGCGCTA CCTGGCCATC 420

TGCCGCCCGC TCCGCGCCCG CGTCTTGGTC ACCCGGCGCC GCGTCCGCGC GCTCATCGCT 480 

GTGCTCTGGG CCGTGGCGCT GCTCTCTGCC GGTCCCTTCT TGTTCCTGGT GGGCGTCGAG 540

CAGGACCCCG GCATCTCCGT AGTCCCGGGC CTCAATGGCA CCGCGCGGAT CGCCTCCTCG 600 

CCTCTCGCCT CGTCGCCGCC TCTCTGGCTC TCGCGGGCGC CACCGCCGTC CCCGCCGTCG 660 

GGGCCCGAGA CCGCGGAGGC CGCGGCGCTG TTCAGCCGCG AATGCCGGCC GAGCCCCGCG 720 

CAGCTGGGCG CGCTGCGTGT CATGCTGTGG GTCACCACCG CCTACTTCTT CCTGCCCTTT 780 

CTGTGCCTCA GCATCCTCTA CGGGCTCATC GGGCGGGAGC TGTGGAGCAG CCGGCGGCCG 840 

CTGCGAGGCC CGGCCGCCTC GGGGCGGGAG AGAGGCCACC GGCAGACCGT CCGCGTCCTG 900 

(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 



Met Gly Ser Pro Trp Asn Gly Ser Asp Gly Pro Glu Gly Ala Arg Glu
1               5                   10                  15

Pro Pro Trp Pro Ala Leu Pro Pro Cys Asp Glu Arg Arg Cys Ser Pro
            20                  25                  30

Phe Pro Leu Gly Ala Leu Val Pro Val Thr Ala Val Cys Leu Cys Leu
        35                  40                  45

Phe Val Val Gly Val Ser Gly Asn Val Val Thr Val Met Leu Ile Gly
    50                  55                  60

Arg Tyr Arg Asp Met Arg Thr Thr Thr Asn Leu Tyr Leu Gly Ser Met
65                  70                  75                  80

Ala Val Ser Asp Leu Leu Ile Leu Leu Gly Leu Pro Phe Asp Leu Tyr
                85                  90                  95

Arg Leu Trp Arg Ser Arg Pro Trp Val Phe Gly Pro Leu Leu Cys Arg
            100                 105                 110

Leu Ser Leu Tyr Val Gly Glu Gly Cys Thr Tyr Ala Thr Leu Leu His
        115                 120                 125

Met Thr Ala Leu Ser Val Glu Arg Tyr Leu Ala Ile Cys Arg Pro Leu
    130                 135                 140

Arg Ala Arg Val Leu Val Thr Arg Arg Arg Val Arg Ala Leu Ile Ala
145                 150                 155                 160

Val Leu Trp Ala Val Ala Leu Leu Ser Ala Gly Pro Phe Leu Phe Leu
                165                 170                 175

Val Gly Val Glu Gln Asp Pro Gly Ile Ser Val Val Pro Gly Leu Asn
            180                 185                 190

Gly Thr Ala Arg Ile Ala Ser Ser Pro Leu Ala Ser Ser Pro Pro Leu
        195                 200                 205

Trp Leu Ser Arg Ala Pro Pro Pro Ser Pro Pro Ser Gly Pro Glu Thr
    210                 215                 220

Ala Glu Ala Ala Ala Leu Phe Ser Arg Glu Cys Arg Pro Ser Pro Ala
225                 230                 235                 240

Gln Leu Gly Ala Leu Arg Val Met Leu Trp Val Thr Thr Ala Tyr Phe
                245                 250                 255

Phe Leu Pro Phe Leu Cys Leu Ser Ile Leu Tyr Gly Leu Ile Gly Arg
            260                 265                 270

Glu Leu Trp Ser Ser Arg Arg Pro Leu Arg Gly Pro Ala Ala Ser Gly
        275                 280                 285

Arg Glu Arg Gly His Arg Gln Thr Val Arg Val Leu
    290                 295                 300



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CGTAAGTGGA GCCGCCGTGG TTCCAAAGAC GCCTGCCTGC AGTCCGCCCC GCCGGGGACC 60 
GCGCAAACGC TGGGTCCCCT TCCCCTGCTC GCCCAGCTCT GGGCGCCGCT TCCAGCTCCC 120 
TTTCCTATTT CGATTCCAGC CTCCACCCGC CGGT 154 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 602 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

AGCTGGTGGT GGTTCTGGCA TTTATAATTT GCTGGTTGCC CTTCCACGTT GGCAGAATCA 60 

TTTACATAAA CACGGAAGAT TCGCGGATGA TGTACTTCTC TCAGTACTTT AACATCGTCG 120 

CTCTGCAACT TTTCTATCTG AGCGCATCTA TCAACCCAAT CCTCTACAAC CTCATTTCAA 180 

AGAAGTACAG AGCGGCGGCC TTTAAACTGC TGCTCGCAAG GAAGTCCAGG CCGAGAGGCT 240 

TCCACAGAAG CAGGGACACT GCGGGGGAAG TTGCAGGGGA CACTGGAGGA GACACGGTGG 300

GCTACACCGA GACAAGCGCT AACGTGAAGA CGATGGGATA ACGTAAGTGG AGCCGCCGTG 360

GTTCCAAAGA CGCCTGCCTG CAGTCCGCCC CGCCGGGGAC CGCGCAAACG CTGGGTCCCC 420

TTCCCCTGCT CGCCCAGCTC TGGGCGCCGC TTCCAGCTCC CTTTCCTATT TCGATTCCAG 480

CCTCCACCCG CCGTGGTGGT GGTTCTGGCA TTTATAATTT GCTGGTTGCC CTTCCACGTT 540 

GGCAGAATCA TTTACATAAA CACGGAAGAT TCGCGGATGA TGTACTTCTC TCAGTACTTT 600 

AA 602 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 198 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Leu Val Val Val Leu Ala Phe Ile Ile Cys Trp Leu Pro Phe His Val
1               5                   10                  15

Gly Arg Ile Ile Tyr Ile Asn Thr Glu Asp Ser Arg Met Met Tyr Phe
            20                  25                  30

Ser Gln Tyr Phe Asn Ile Val Ala Leu Gln Leu Phe Tyr Leu Ser Ala
        35                  40                  45

Ser Ile Asn Pro Ile Leu Tyr Asn Leu Ile Ser Lys Lys Tyr Arg Ala
    50                  55                  60

Ala Ala Phe Lys Leu Leu Leu Ala Arg Lys Ser Arg Pro Arg Gly Phe
65                  70                  75                  80

His Arg Ser Arg Asp Thr Ala Gly Glu Val Ala Gly Asp Thr Gly Gly
                85                  90                  95

Asp Thr Val Gly Tyr Thr Glu Thr Ser Ala Asn Val Lys Thr Met Gly
            100                 105                 110

Arg Lys Trp Ser Arg Arg Gly Ser Lys Asp Ala Cys Leu Gln Ser Ala
        115                 120                 125

Pro Pro Gly Thr Ala Gln Thr Leu Gly Pro Leu Pro Leu Leu Ala Gln
    130                 135                 140

Leu Trp Ala Pro Leu Pro Ala Pro Phe Pro Ile Ser Ile Pro Ala Ser
145                 150                 155                 160

Thr Arg Arg Gly Gly Gly Ser Gly Ile Tyr Asn Leu Leu Val Ala Leu
                165                 170                 175

Pro Arg Trp Gln Asn His Leu His Lys His Gly Arg Phe Ala Asp Asp
            180                 185                 190

Val Leu Leu Ser Val Leu
        195